Abstract



The theory of quantum computation is presented in a self-contained way from a
computer science perspective. The basics of classical computation and quantum
mechanics are reviewed. The circuit model of quantum computation is presented
in detail. Throughout there is an emphasis on the physical as well as the abstract 
aspects of computation and the interplay between them. 

This report is presented as a Master's thesis at the department of Computer 
Science and Engineering at Göteborg University, Göteborg, Sweden.

The text is part of a larger work that is planned to include chapters on quan- 
tum algorithms, the quantum Turing machine model and abstract approaches 
to quantum computation. 
Foreword



The intended readership for this master's thesis in Computer Science is primarily
the computer scientist wishing to get an idea of what quantum computing is
about. But I also have physicists in mind. Therefore, the physicist will find 
material on physics that will appear to be obvious and the computer scientist 
will find material on computers that will likewise appear to be trivial. So perhaps 
the reader who will benefit the most from the text is the one who is unfamiliar 
with both subjects. The point is that I'm actually not writing for the lucky 
few who have expertise in both fields but rather for those who come from either 
field, or from neither. The text is thus basically introductory, but not
elementary. 

There is also a further point. Since quantum computation straddles the 
borderline between physics and computing science, it is interesting to spell out 
the basic assumptions and facts of both fields in some detail. 

Obviously, this text can be seen as a review article. But I have no intention 
to treat every aspect of the subject, which is simply too vast. The depth of the
treatment will also vary considerably. Some basic definitions and some, in my 
opinion fundamental, results, will be spelt out in detail, whereas many topics 
that a comprehensive text would treat, will be passed over rapidly. The principle 
behind these choices is that I will attempt to be detailed on issues that have a
bearing on the connections between physics and computation. What has been 
left out can be found in the textbook literature and original articles on the 
subject as well as in other review articles. 

The text is mostly written in theoretical physics style, introducing no more 
formalism than needed to make the arguments clear. The degree of formal- 
ization will vary. A high level of formalization throughout tends to make the 
text unreadable, whereas a low level of formalization might leave the reader 
unnecessarily confused. Definitions, derivations and results are presented and 
proved in the running text, but occasionally, due to the nature of the subject, a
more formal style will be adopted. I've chosen a level of formalization that I 
found appropriate and in the end it reflects my own taste. 

There are of course lots of review articles on quantum computation. I have
therefore decided not to repeat too much of the standard calculations and deriva-
tions, focusing instead on what I find interesting, trying to put forward a slightly
different perspective, and being detailed on points that are often glossed
over. In this respect I hope this text can be a complement to the many excellent
books and reviews already in circulation.

One seldom learns a subject by reading just one book or just one review 
article. In writing Chapter 4, the introduction to quantum mechanics, I realized
how much is left implicit, even though you try to make the text self-contained. If
you haven't already mastered a subject, perhaps you cannot gain so much from 
just one review - you must read several articles and books to see the subject 
treated in different ways. 

Outline of contents 

Chapter 1 is an introduction to the text and a motivation for studying quantum
computation. Some fundamental questions on the connection between physics 
and computation will be mentioned. They will be returned to in a planned part 
II of this work. 

Chapter 2 is an overview of the central concepts of classical computation such 
as the notions of computational models, computability and complexity theory. 
Together with Chapter 5 on general quantum theory it serves as the foundation 
for a treatment of quantum computational models and quantum algorithms. 

Chapter 3 is a brief introduction to quantum computation. It serves mainly
to motivate the subsequent two chapters on quantum mechanics.

Chapter 4 contains a quite extensive introduction to quantum mechanics 
written in a physics style. Three important models are treated in some detail:
a particle trapped in a potential well, the harmonic oscillator and the theory 
of angular momentum. Apart from being important in quantum physics, these 
models are the standard ones employed when teaching introductory quantum 
mechanics. All concepts of quantum mechanics can be introduced while studying 
these simple models. 

Chapter 5 then sets up the formal theory of quantum mechanics in terms 
of linear operators on Hilbert spaces. After that, the stage is set for treating 
quantum computation. 

Chapter 6 describes in an abstract way the quantum circuit model. 

As this text is mainly on the abstract and theoretical aspects of classical 
and quantum computational models, not very much will be said on practical 
realizations of quantum computing devices, or quantum computers for short. 
Presumably, the theoretical aspects of the subject matter will remain relevant, 
while the practical, implementational details are likely to undergo more dramatic 
change. 

One last remark. My initial intention was to treat also the quantum Turing
machine model and quantum algorithms. However, the scope of the project
would then have gone beyond the boundaries of a master's project. For this
reason, these topics will be left for a part II. 
Chapter 1

Introduction 



Computer science, and in particular the theory of computation, can be studied 
without explicit regard to physics. The whole area of research into classical
computability is phrased without any reference to physics or even real computing 
machines. The related areas of syntax and semantics of programming languages 
make no reference to anything more real than symbol shuffling by abstract 
machines. 

Classical computation is a discrete process. Whether viewed in terms of 
Turing machines, RAM machines or operational semantics of programming lan-
guages in terms of abstract stack machines^, it really just amounts to string
processing or symbol manipulation. The number of symbols is finite and the 
number of basic operations is finite. A program is a finite set of instructions 
in terms of the operations acting on the set of strings built out of the symbols. 
Seen in this way, computation seems to be detached from physical reality, and 
any 'system' that 'understands' the rules can perform the computation. 

From a practical point of view, the software/hardware division also stresses
this apparent independence of physics. Software, in the form of computer
programs written in any of the many hundreds of invented programming lan-
guages, is again just strings of symbols. It seems to have no more connection
to physics than the ink with which it is recorded on paper. When programs are
compiled and stored electronically, the link with physics is somewhat more pro- 
nounced but still weak. It is upon actually running the program, which always 
entails the motion of some physical system, that the physical nature of compu- 
tation comes into focus. This is obvious if the algorithm is carried out by hand 
or using some mechanical computing device. 

So there is a link, however weak, to some physical substratum, and it is not
possible to sever this link completely. On the other hand, it is a fundamental
property of reality that it is possible to work on and solve computational problems at
abstract levels without having to check physical realizability at every step. This
is analogous to the process of abstraction which is so characteristic of computer
science. By abstraction, ever more powerful and complicated computational
tools can be invented, which, once it has been ascertained that they can be
implemented in terms of more primitive structures, can be used to solve more
difficult computational problems without checking the implementation at every
step.

^An abstract stack machine is a notational system for giving step-by-step meaning to the
primitives of a programming language.

But if we work the other way, from abstract levels of programming structures 
to more concrete primitives, then eventually we will arrive at some physical sys- 
tem, a computing machine, that actually performs the physical motion needed 
in order to carry out the computations. In digital computers this is switching 
voltage levels in transistors, which in its turn involves the collective motion of 
large numbers of electrons. 

Thus, as was pointed out and studied by Landauer, information is always
carried by some physical medium, and likewise computation is a physical process 
constituted by some well defined motion of a physical system. 

1.1 The theory of computation 

The theory of computation arose in the nineteen thirties as a response to prob- 
lems in the foundations of mathematics and logic, in particular in connection
to David Hilbert's Entscheidungsproblem. The Entscheidungsproblem is a prob- 
lem within the formal or axiomatic approach to mathematics. Hilbert's program 
was to formalize mathematical theories into a set of axioms defining relations 
between the undefined primitive notions of the theory, and a set of rules of de- 
duction. In this way one should be able to secure the foundations of mathematics
as well as mechanize the process of theorem proving. Good properties, like con- 
sistency and completeness, should be possible to ascertain within the axiomatic 
system. 

The axiomatic approach itself has a long history dating back to antiquity. 
After the invention of the calculus by Newton and Leibniz in the late seventeenth
century, there was very rapid progress in the fields of applied mathematics and
physics. The new mathematics was phrased in an axiomatic language but the 
underlying concepts were intuitive and often vague. In the background history of 
Hilbert's approach we find attempts to secure the foundations of such concepts 
as infinitesimals, limits, real numbers, functions and derivatives to name a few. 
As an aside it is interesting to note the very close interplay between mathematics 
and physics during this period. Apart from being a theoretical subject of its 
own, mathematics is also the language of the physical sciences and of technology. 

Hilbert's formalistic approach to mathematics made a distinction between 
the syntactic aspects of mathematics, i.e. the axioms and the rules of deduction, 
and the semantic aspects, i.e. what the mathematical concepts and theorems 
actually mean. 

Physicists, engineers and applied mathematicians are normally interested in
the meaning of mathematics. Phenomena in the real world, and whole areas of 
science, are modeled using mathematics. On the other hand, once the modeling 
is done, the actual calculations can be performed without considering the inter-
pretation. In practice there is always an intricate interaction between modeling,
calculation and interpretation. But the point is clear: the strength of mathemat-
ics derives from this division into syntactic rules of calculation and its semantic,
or intuitive, interpretation in terms of objects in the physical world. 

The same interplay between syntax and semantics is, of course, present in 
computer science itself. We write computer programs in order to solve scientific, 
engineering, economic, administrative, everyday and entertainment problems. 
But the programs run on computers that perform purely syntactic symbol shuf- 
fling. In the theory of programming languages there is also this division between 
the syntactic and the semantic aspects of programming and program execution. 

1.2 The input/output model of physics and computation

The paradigm of computation is capable of encompassing widely different sys- 
tems. On a certain level of abstraction, the description of a physical system and 
a computing machine are very similar, not to say identical. Almost everything
conceivable can be described by an input → function → output model. A
physical system is defined as some well defined portion of space and time with 
a well defined interface and interaction with the environment. Then specifying 
the input using some labeling, the output can, in principle at least, be computed 
using the dynamical laws. 

In computer science, we always know the dynamics of the system, because 
this is the program, and we set up the system in order for it to compute previ- 
ously unknown outputs from given inputs. Furthermore, essentially due to the 
discrete finite nature of input and output, there is a well agreed on paradigm 
for this input-processing-output model. As soon as the labeling (the alphabet) 
of the input/output states is defined, the computation is just a syntactically
ruled shuffling of the labels. 

In physics the focus is different. First, we sometimes don't know the dynam- 
ical laws. The very object of fundamental physics is to investigate the dynamics 
through theory, experiments and observations. 

Is there a difference in the computational strength of different physical sys- 
tems? What algorithms can be performed with what physical systems? These 
are questions not normally posed in theoretical computer science where the dis- 
cussion is from the outset framed within the classical computational models, all 
of which are basically notational systems. 

However, it is generally believed that all physical systems constrained to
work in a discrete stepwise fashion, according to precisely and finitely stated
rules, as in the logical description of a Turing machine or an electronic
computer, or even a human computer as envisioned by Turing, are equivalent
in computational power.

The Church-Turing thesis identifies the set of effectively or intuitively calcu-
lable functions with the set of functions computable within any of the classical
computational models. In a historical context, effective computability meant
computability by an abstract human being working to precise rules.

The thesis has however acquired connotations connecting it to machine com- 
putation, in particular electronic digital computing machines. In this latter 
sense the thesis is true; what can be computed by a general purpose digital 
computer can be computed by a Turing machine. This is due to the fact that 
a digital computer can be modeled as an abstract RAM machine. Whether 
any conceivable physical computing machine is constrained by the thesis is not
known. This is a question about all of physics, and we don't know
all of physics yet. 

On the other hand, the precise mapping between models of computation and 
human and digital machine computation and the consequent possibility to study 
computation in the abstract has led to the view that the limits of computation
are set by mathematics and logic. The development of quantum computation 
has, to a degree, challenged this view of computation. Since computations are 
basically physical processes when actually carried out, it can be argued that 
what can be computed is a question of physics, not a question of mathematics 
or logic.

1.3 Classical physics and the computer

The CPU of a digital electronic computer as well as the main memory used for
intermediate storage consist of huge numbers of transistors and other semicon-
ducting devices working in an on/off fashion corresponding to distinct voltage
levels. These voltage levels constitute a concrete realization of the abstract
bit of information. The semiconductors in their turn are arranged into circuits
implementing logical gates. The precise mapping between abstract logical gates
working on bits and circuits working with voltage levels is the basis for the
success of the electronic digital computer. But without the extreme speed
with which the switching between on and off can be performed (on the order
of nanoseconds) the computer would not be so powerful. There is also an en-
gineering aspect of this. The transistors must work in parallel, and in practice
the CPU clock controls the working of the computer so that at each time tick, 
bit flips are performed in parallel. 
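
The mapping from abstract gates to bits can be made concrete without any reference to voltages. The following is a minimal sketch in Python, chosen only for concreteness; the function names (nand, half_adder and so on) are illustrative assumptions and do not come from the text. It treats a single NAND operation as the one primitive "switch arrangement" and builds the other gates, and a one-bit adder, on top of it.

```python
def nand(a: int, b: int) -> int:
    """NAND on bits 0/1; the only primitive assumed in this sketch."""
    return 1 - (a & b)


def not_(a: int) -> int:
    # NOT is NAND with both inputs tied together.
    return nand(a, a)


def and_(a: int, b: int) -> int:
    # AND is NAND followed by NOT.
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    # De Morgan: a OR b = NOT(NOT a AND NOT b).
    return nand(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    # True when at least one input is 1 but not both.
    return and_(or_(a, b), nand(a, b))


def half_adder(a: int, b: int) -> tuple:
    """Return (sum_bit, carry_bit) for two input bits."""
    return xor(a, b), and_(a, b)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b))
```

Nothing in the sketch depends on how nand is physically realized, which is the point of the abstraction: a relay, a vacuum tube or a transistor circuit would serve equally well at the logical level.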

The electronic digital computer can therefore be seen as a special physical 
system constrained or engineered to work in a discrete way. All operations per- 
formed by the computer are discrete, but the underlying physical processes are 
continuous, or at least the description of these physical processes is continuous. 
The actual bit flips between zero and one, when studied at the physical level,
take a certain amount of time and the transition between voltage levels can 
actually be studied by employing a good enough oscilloscope. But once one 
has abstracted away from these physical considerations, the operations of the 
electronic computer could as well be performed by other physical systems, for 
example electric circuits working with magnetic relays. The performance would 
be much slower and other engineering problems would ensue.^

^Actually, such computers, and computers based on vacuum tubes, preceded the solid-state
electronic computer.

The workings of a transistor, as far as electronics goes, can be described
by classical electrodynamics. But electronics is not far from quantum physics.
Transistors are quantum mechanical devices that could not possibly be under-
stood or built without a knowledge of quantum physics. However, these tran-
sistors are wired to work in a discrete fashion as switches. Logically there is
no difference between a transistor switch, a mechanical switch or an electrome-
chanical switch. The only difference lies in performance measures like speed,
reliability and energy consumption. The underlying physics of the transistor
must be understood in terms of quantum mechanics, but once that is done,
the transistor as a circuit element can be understood in classical terms. And
furthermore, as pointed out above, these circuit elements can be abstractly
modeled and reliably worked with without at every step consulting the under-
lying physics. When implementing the circuit, design issues like power supply,
switching times, delays et cetera must of course be faced, but this does not, in
principle, influence the logical design of the circuit.

In the end it all comes down to the fact that we can build fast electronic com-
puters that can effectively carry out algorithms. These computers are based on
quantum mechanical physical systems constrained to work in a discrete, classical
fashion. Furthermore, miniaturization notwithstanding, these systems involve
the collective behavior of large numbers of particles (electrons), thus relying on
statistical properties of the systems.

In contrast, in the would-be quantum computers, it is the individual behavior
of the particles we have to rely on. This is at the same time the source of the
power of quantum computation and the source of the engineering problems
of actually building devices capable of enacting quantum algorithms.

1.4 Quantum computation

In 1981, the physicist Richard Feynman pointed out that digital computers can-
not simulate quantum systems without an exponential slowdown. Feynman
wasn't particularly interested in approximating quantum physics; what he dis-
cussed was exact simulation, the question of whether digital computers could do
exactly the same as the quantum system would do. He came to the conclusion
that present day physics does not allow this, essentially due to a mismatch be-
tween the discrete nature of digital computers and the exponentially large state
space of quantum systems. Feynman's interest in the physics of computers
sparked off the research into quantum computation in the 1980s.

Other lines of research that contributed to the initial impetus of quantum
computation were the work of Bennett and of Fredkin and Toffoli on re-
versible computation, as well as the previously cited work by Landauer.

In quantum computing we are interested in the computational strength of
physical systems that by their very nature must be analyzed or understood
according to quantum mechanics. Quantum computation relies on the exact
manipulation of individual quantum physical objects, in contrast to the
classical computer, where averaged classical statistical properties of the objects
suffice. This is the source of the strength of quantum computation as well as the
difficulties in actually building quantum computing devices. There is to be no
transition to the classical regime during the computation, as that would destroy
the very features that lend the quantum computer its strength.

Some researchers in the field remark that, as classical physics is fundamen- 
tally wrong, any correct theory of computability must be based on quantum 
mechanics. This point of view is very clearly stated in the paper by Deutsch 
where it is claimed that there is a physical assumption underlying the
Church-Turing thesis.

Deutsch argues that the Church-Turing thesis is actually at variance with
classical physics, but that it can be rephrased in agreement with quantum 
physics. According to this point of view, quantum computers are not fun- 
damentally more powerful than Turing Machines, though they might be faster. 
The basic line of the argument is that the continuous nature of classical physics 
makes it in principle impossible to simulate a classical physical system by a dis- 
crete computer. But quantum physics is fundamentally discrete, and therefore 
the Church-Turing thesis connects effective methods not to classical computing
machines but to quantum computing machines. The argument is not entirely 
convincing, and will be returned to in part II. 

And with this we plunge into the details! But first a disappointing remark is 
perhaps in order. There are no quantum computers as yet if one does not count 
experimental setups working on the equivalent of a few bits. Because of this, 
not much will be said here about realizations. Anything written about practical 
implementations of quantum computing devices will surely soon be outdated 
by new experimental developments, whereas the theoretical part of the topic is 
likely to be more stable. 
Chapter 2

Classical computation 



Quantum computation relies on the ability of quantum mechanical physical
systems to perform computations. In order to prepare some common ground 
for the discussion, we will review the theory of classical computation in terms of 
Turing machines, the Church-Turing thesis and the limitations of computation.

There are two fundamental questions in the theory of computation: what
can be computed in principle, and what can be computed efficiently. The first
question is addressed by the theory of computability and the second by the theory
of complexity. In order to discuss these questions in general without reference 
to any particular computing machine or programming language, one must work 
within some abstract mathematical model of computation. Still, it must be pos-
sible to understand precisely the relationship between the model and the properties
of actual physical computing machines. 

The models of computation developed in the nineteen thirties were all at- 
tempts to capture in precise mathematical terms what is meant by a compu-
tation. In order to answer the Entscheidungsproblem^, i.e. David Hilbert's
question of whether there exists a mechanical method to determine if mathemat-
ical statements are true or not,^ it was necessary to have a precise definition of
a mechanical method in order to treat the question with mathematical tools. In 
this context, mechanical does not necessarily mean, and in fact did not mean, 
a procedure performed by a machine. Mechanical means algorithmic. 

As it turned out, the different models put forward (Church's λ-calculus,
Herbrand-Gödel's recursive functions and Turing's automatic computing
machines^) were all shown to be equivalent in the sense that they all defined the
same set of computable functions. All three models
were meant to be abstract mathematical models of computation. Turing, how-
ever, phrased his concepts in terms of machines reading and writing symbols
on a tape, and compared the process of computation to humans working with
paper and pencil. And it is also with the Turing model that the connection to
present day digital computers and to physics is most clearly seen.

^This was apparently not the only impetus to this work.

^For Hilbert, truth of a statement was equivalent to it being a theorem, otherwise one would
have to distinguish between truth and derivability when stating the Entscheidungsproblem.
John von Neumann discussed the problem in terms of provability.

^The term is Turing's own.

The formal models of computation must be connected to the intuitive notion 
of an algorithm. The models are meant to capture what it means to carry out 
a mechanical, or algorithmic, procedure. 

The Church-Turing thesis is often quoted in this context, but there seems
to be some confusion as to what it actually says. After 50 years of dramatic
evolution of digital computers, the thesis has, perhaps not surprisingly, acquired
connotations or meanings not present in the original formulation [12]. We will
be concerned with what the thesis did say when it was first formulated, what is 
normally meant by it nowadays, and how it relates to quantum computation. 

2.1 Some definitions 

The following definitions are for the benefit of the reader not familiar with 
computer science terminology, and for fixing the sense in which the terms will 
be used in this text. 

2.1.1 Algorithm 

It is not possible to give a formal definition of the concept of an algorithm but 
it can be characterized well enough so that no ambiguity remains as to the 
meaning of the term. This remark is true for all the concepts treated in this 
section. Perhaps characterization is a better term to use. An algorithm consists 
of a set of instructions for carrying out a certain task. In computer science the 
task is a computation, a notion that will be defined below. The concept can, 
and must be, further elaborated by the following clauses. 

• The set of instructions should be precise and unambiguous. The number 
of instructions should be finite and each instruction should be finite in 
length. 

• A machine or a human can execute it. 

• There should be no room for subjective decisions, appeal to human intel- 
ligence or creative intervention of the user. 

• It should solve some general problem. 

• It need not be phrased in any particular language, programming or natu- 
ral. 

The first three clauses imply that all creative or intelligent effort goes into 
the task of finding or constructing the algorithm. Once the algorithm is known, 
it should be possible to carry it out automatically or mechanically. The fourth 
clause has to do with the fact that we are not in general interested in particular 
cases; rather, we want to solve sets of problems, often parameterized by a col-
lection of variables. Therefore a general algorithm has a domain of definition,
which is the set of meaningful, or allowed, input values or instances. The last 
point means that algorithms have an abstract existence independent of any par- 
ticular language. In practice, an exact programming language or pseudo-code 
language is useful in order to satisfy the first three clauses. The term mechanical 
method or effective method can be considered to be synonymous with algorithm. 
The word procedure can be used instead of method. Sometimes, the word gen-
eral will be used to emphasize that we are considering methods applicable to a 
range or set of problems. 
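
As an illustration of the clauses above, consider Euclid's algorithm for the greatest common divisor. The Python rendering below is only one possible phrasing, and the function name gcd is chosen here for illustration. The instructions are finite and unambiguous, no creative decisions are needed to follow them, and the algorithm solves a general problem whose domain of definition is taken, in this sketch, to be pairs of non-negative integers.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid)."""
    if a < 0 or b < 0:
        # Inputs outside the domain of definition are rejected.
        raise ValueError("inputs must be non-negative integers")
    while b != 0:
        # Replace (a, b) by (b, a mod b); the second entry strictly decreases.
        a, b = b, a % b
    return a


if __name__ == "__main__":
    print(gcd(48, 18))
```

The same instructions could equally well be written in natural language or carried out by hand, in line with the last clause: the algorithm is independent of the language used to state it.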

The computational models of the thirties identified this informal notion of 
an algorithm with precisely defined formal models of computation. 

Note that we do not include termination among the characteristics of algo- 
rithms. That would be inappropriate for two different reasons. Firstly, many 
algorithms are not meant to terminate, at least not before we actively choose to 
terminate them. Examples are operating systems, web servers and lots of ev- 
eryday applications like word processors. Secondly, termination is a non-trivial 
issue that has to do with executing, or running, the algorithm. This will be 
discussed in the next section. 

2.1.2 Computation 

By a computation we mean the actual carrying out of an algorithm. From this
it follows that computations are processes taking place in time, that can be carried
out by either machine, human or any other suitable physical system. The only 
requirement is that the computing system 'understands' the language used to 
write the algorithm in, and thus is able to carry out the instructions. 

This distinction between an algorithm as a passive description of a compu- 
tation and a computation as an actual enacting of an algorithm is not always 
upheld. The terms are often used interchangeably. In practical work with com- 
puters this does not lead to any confusion but when discussing fundamental 
questions of principle it is helpful to maintain this distinction. 

When it comes to quantum computation and quantum algorithms the dis- 
tinction is somewhat more acute. At the present time there are no quantum 
computers, so there is nothing to run the quantum algorithms on. Furthermore, 
it is not practical to simulate quantum computations on classical computers, as 
the time evolution of a quantum mechanical system that is inherent in quantum 
computation requires exponential resources! 
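
A rough count shows where the exponential cost comes from. The sketch below assumes a brute-force representation in which the state of n two-level quantum systems is stored as a list of 2^n complex amplitudes, each taking 16 bytes (two 64-bit floating point numbers). Both the representation and the byte count are assumptions made for illustration, as are the function names.

```python
def amplitudes(n_systems: int) -> int:
    """Number of complex amplitudes describing n two-level quantum systems."""
    return 2 ** n_systems


def state_vector_bytes(n_systems: int, bytes_per_amplitude: int = 16) -> int:
    """Memory for a brute-force state vector (assumed 16 bytes per amplitude)."""
    return amplitudes(n_systems) * bytes_per_amplitude


if __name__ == "__main__":
    for n in (10, 20, 30, 40, 50):
        gib = state_vector_bytes(n) / 2 ** 30
        print(f"n = {n:2d}: {amplitudes(n):>20,d} amplitudes, {gib:,.6f} GiB")
```

Each additional two-level system doubles the storage, and a single step of the time evolution has to touch every amplitude, so the time cost grows in the same way.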

If the above characterization of an algorithm is applied to a human perform- 
ing a computation, the question can be asked as to what are the limitations 
of algorithms or computation. What can be calculated effectively, or mechani-
cally, is precisely what can be done by following an algorithm with the additional
clause that the algorithm should always produce the desired result in a finite 
number of steps. This question, whether the algorithm terminates or not, turns 
out to be a nontrivial issue as already noted. 
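
A small example shows how hard the termination question can be even for a very short algorithm. The procedure below follows the Collatz rule: halve an even number, and replace an odd number n by 3n + 1, until 1 is reached. Whether this loop terminates for every positive integer is a well-known open problem. The function name and the max_steps cutoff are assumptions added so that the sketch itself always returns.

```python
from typing import Optional


def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Count Collatz steps from n down to 1, or return None if the cutoff is hit."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            # Giving up proves nothing about termination; it only bounds the run.
            return None
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    print(collatz_steps(6))
```

Every instruction is precise and mechanical, so the procedure qualifies as an algorithm in the sense used here, yet nothing in the text of the algorithm tells us whether the computation ends for all inputs.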



2.1.3 Program 



A program is an algorithm written in a certain language. The term program is 
used in two slightly different, but related senses. 

In the first sense, we are referring to a program written in a general purpose 
programming language. Such a program should be possible to run on the appro-
priate computing machine without further work, except possibly compilation.
Hence the program must contain all circumstantial information like include or 
import statements for supporting files and software. The program furthermore 
should handle input and output of data, either in an interactive way through 
standard input and output devices or via a file system. A program is therefore 
a practical embodiment of an abstract algorithm. 

In the second sense, the term program is used for a collection of instructions 
for a computation in an abstract computational model like Turing machines. In 
this case, there need not be a physical machine to carry out the computation. 
But it should be possible to carry it out (by a human) by adhering to the rules
and specifications of the computational model. 

In some cases the instructions might be ordered in a list. In that case, we 
consider the execution order to be given by the ordering of the instructions in 
the list, possibly with branching off to labels in the list.^

When the program is actually a set of instructions, no execution ordering is 
presupposed. The machine looks up the proper instruction to execute depending 
on the state of the machine and the data. This is the way a Turing machine 
computes. 

2.1.4 Process 

By a process we mean a program in execution. In some contexts, notably 
operating systems, the word process is reserved for executing programs that are 
not meant to terminate. In the present context we are primarily interested in 
terminating processes and I will use the term in both senses, letting the context 
determine which meaning is referred to. 

Thus computation and process emphasize the physical and dynamical side
of our topic, whereas algorithm and program emphasize the logical and
mathematical side.

2.1.5 Alphabets, Strings and Numbers 

Numbers and strings of symbols are fundamental to computer science. Just 
as a computation can be thought of as the calculation of numerical values of 
a function, it can as well be regarded as the processing of strings of symbols. 
The equivalence follows from the fact that any finite set of finite strings of
symbols drawn from a finite alphabet can be put in a one-to-one correspondence
with a subset of the natural numbers. Any enumeration of the strings in some
lexicographic order will do. We will make these concepts somewhat more precise.

^This is necessary in order to implement the programming primitives
if <condition> then <statement> else <statement>, and while <condition> do <statement>.



Numbers 

By a number we mean, unless otherwise stated, a natural number, i.e. a member
of the infinite set $N = \{0, 1, 2, \ldots\}$. This set can be defined inductively,
starting from the number 0 and adding, in a step by step fashion, the successors.
Informally,

    0 is a natural number.

    If n is a natural number, then the successor n + 1 is a natural number.

This is not really a good definition, since in writing n + 1 for the successor of n
we are in fact presupposing the numbers together with addition. But it captures
the idea behind the following more formal definition.

The set N of natural numbers is defined by the clauses

\[ 0 \in N, \qquad n \in N \implies S(n) \in N, \tag{2.1} \]

where $S(n)$ denotes the successor of n.

Arithmetical operations, like addition and multiplication, can be defined on
this basis [14].
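As a minimal sketch of how arithmetic can be built on zero and successor alone, the following Python fragment represents 0 as the empty tuple and S(n) as a one-element tuple wrapping n. The representation and the names ZERO, succ, add, mul, from_int and to_int are assumptions made for illustration only; they are not taken from [14].

```python
# Natural numbers built from zero and the successor S only.
# Representation (an assumption of this sketch): 0 is (), S(n) is (n,).
ZERO = ()

def succ(n):
    """Return the successor S(n)."""
    return (n,)

def add(m, n):
    # m + 0 = m ;  m + S(k) = S(m + k)
    return m if n == ZERO else succ(add(m, n[0]))

def mul(m, n):
    # m * 0 = 0 ;  m * S(k) = (m * k) + m
    return ZERO if n == ZERO else add(mul(m, n[0]), m)

def from_int(k):
    """Build the representation of k by applying succ k times to ZERO."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    """Count how many successor applications separate n from ZERO."""
    k = 0
    while n != ZERO:
        n, k = n[0], k + 1
    return k
```

Both add and mul recurse on their second argument only, so every call eventually reaches the ZERO clause.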

By $N^d$ we denote the set of all d-tuples $(n_1, n_2, \ldots, n_d)$ of numbers.

Strings and languages

The concept of a symbol, or a token, will be taken to be intuitively given and 
not further analyzed. An alphabet is a non-empty set of symbols, generically
denoted by $\Sigma$. A string (over an alphabet) results when symbols taken
from the alphabet are written consecutively. The order in which the symbols
are written within the string matters. There must be no blanks, commas or
other separators between the symbols in a string; if blanks or other
separators are needed they should be included among the symbols. A language
is a set of strings. The following notation is useful to have at hand. 

The set of all strings over the alphabet $\Sigma$ is denoted by $\Sigma^*$ and it includes
the empty string $\varepsilon$. An equivalent term for string is word. $\Sigma^+$ denotes the set
of strings with the empty string excluded. In some contexts, strings will be
enclosed by " ", as is prevalent in computer programming languages. Now, a
language is an arbitrary subset of $\Sigma^*$. This is all we will need from the general
theory of formal languages (cf. \lb\). 

Bit strings 

One particularly important kind of strings are the bit strings. They are based
on the alphabet {0, 1}. A bit string is any combination of the symbols 0 and 1
written without any blank separators. By the notation $\{0, 1\}^k$ we mean the set of
bit strings of length k. Such a string will also be more explicitly denoted by
"$b_1 b_2 \cdots b_k$". It is naturally interpreted as a k-bit binary number. Then the
corresponding base-10 representation of the binary number can be used as a
shorthand for the string, as in the example "0101" = 5.
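The correspondence between bit strings and numbers can be sketched in a few lines of Python. The function names bits_to_number and number_to_bits are hypothetical helpers introduced here for illustration.

```python
def bits_to_number(bits):
    """Interpret a bit string such as "0101" as a binary number."""
    return int(bits, 2)

def number_to_bits(n, k):
    """Return the k-bit string for n, padded with leading zeros."""
    if not 0 <= n < 2 ** k:
        raise ValueError("n does not fit in k bits")
    return format(n, "0{}b".format(k))
```

Note that the length k has to be supplied when going from a number back to a string, since "0101" and "101" denote the same number but are different strings.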

2.1.6 Functions 

The concept of a function is supposed to be well-known, but we record the
basic definition here for completeness and to fix our notation. A function can
be regarded as a mapping between two sets X and Y, taking an element x in
X and mapping it into a unique element y in Y. This is formally written as
$f : X \to Y$. When we want to focus on generic, or particular, elements that are
connected by the mapping, we write $y = f(x)$ or $b = f(a)$. The element b in Y
is called the image (under the mapping f) of the element a in X.

A more formal definition of functions is based on the concept of a relation.
A relation is a subset of the set of all ordered pairs $(x, y)$ where $x \in X$ and $y \in Y$. A
function f is a special kind of relation for which we require: if $(x, y) \in f$ and
$(x, z) \in f$, then $y = z$. This condition expresses the uniqueness requirement,
that each x in X is mapped to at most one y in Y. This is the only restriction on a
function, and consequently, the concept of a function is very general indeed.^

If the function is defined for all elements in X, it is a total function, otherwise
it is a partial function. The set of elements of X for which the function is defined is
called the domain of definition. The set of elements of Y which are images of elements
of X is called the range.

We will almost exclusively consider functions from $N^d$ to $N$, from $\{0, 1\}^k$ to
$\{0, 1\}^l$, or from $\Sigma^*$ to $\Sigma^*$.
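For finite sets, the relational definition can be checked mechanically. The following sketch assumes that a relation is given as a Python set of (x, y) pairs; the helper names is_function, is_total and value_range are illustrative only.

```python
def is_function(relation):
    """True if no x is paired with two different y values."""
    images = {}
    for x, y in relation:
        if x in images and images[x] != y:
            return False
        images[x] = y
    return True

def is_total(relation, X):
    """True if the relation pairs every element of X with some y."""
    return set(X) <= {x for x, _ in relation}

def value_range(relation):
    """The set of elements of Y that occur as images."""
    return {y for _, y in relation}

# Halving is a partial function on {0, ..., 4}: it is undefined on odd numbers.
halving = {(0, 0), (2, 1), (4, 2)}
```

Here is_function(halving) holds while is_total(halving, range(5)) does not, so halving is a partial function in the sense of the text.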

Enumerations 

A function is said to be one-to-one if the images of two different elements in the
domain are different. If every element in the set Y is the image of an element in
the set X (i.e. the range of the function is the whole set Y) then the function
is said to be onto.

Furthermore, a total function which is both one-to-one and onto is called a
one-to-one correspondence. Such functions are useful for comparing the numbers
of elements in two sets, as they, as the name suggests, set up a one-to-one
correspondence between the sets. If the one-to-one correspondence is between
a set X and a subset of the natural numbers, then it can be used to count, or 
enumerate, the elements in X. Such an enumeration also provides an ordering 
of the elements in the set, as the following paragraph makes explicit in the case 
of strings. 

The number of elements in a set is called its cardinality. A finite set has a
cardinality which is a natural number. An infinite set, the elements of which
can be put in a one-to-one correspondence with the set N of natural numbers,
is said to have cardinality $\aleph_0$.

^As we will see, when the sets X and Y are infinite, just a denumerable subset of all
possible functions according to this definition are actually possible to compute by algorithmic
methods.

Lexicographic ordering 

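It is often useful to be able to order strings in a lexicographic order. The
definition is modeled on the ordering of words in the English language. An example
makes the concept clear. Suppose $\Sigma = \{S_1, S_2, S_3\}$, then the lexicographic
ordering of the strings is the infinite list

\[ [\varepsilon, S_1, S_2, S_3, S_1S_1, S_1S_2, S_1S_3, S_2S_1, S_2S_2, S_2S_3, \ldots]. \]

Shorter strings precede longer ones, and strings of equal length are ordered
alphabetically. We can define a function lex from the set of all strings $\Sigma^*$ to
the set of natural numbers N,

\[ \mathrm{lex} : \Sigma^* \to N, \]

assigning to each string its position in the list, where in particular $\mathrm{lex}(\varepsilon) = 0$.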

Clearly, the lexicographic ordering provides an enumeration of the strings in 
the language. 
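The enumeration can be sketched in Python as follows. For brevity the sketch assumes single-character symbols, so that the alphabet is a string such as "abc" standing in for the three symbols of the example; the names enumerate_strings, lex and first_strings are illustrative.

```python
from itertools import count, islice, product

def enumerate_strings(alphabet):
    """Yield every string over `alphabet`: shorter strings first,
    strings of equal length in alphabetical order."""
    for length in count(0):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

def lex(word, alphabet):
    """Return the position of `word` in the enumeration above."""
    index = 0
    for symbol in word:
        index = index * len(alphabet) + alphabet.index(symbol) + 1
    return index

def first_strings(alphabet, how_many):
    """Return the first `how_many` strings of the enumeration as a list."""
    return list(islice(enumerate_strings(alphabet), how_many))
```

The function lex computes the position directly, without running through the list, by reading the string as a numeral in which the digits run from 1 to the size of the alphabet.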

2.1.7 Decision procedures and Computation procedures 

It is sometimes useful to distinguish between algorithms for computing values of
functions and algorithms for yes/no decisions. In the first case, the object of the
algorithm is to compute values for functions $f : N^d \to N$. Such an algorithm
can be called a computation procedure. Since it is possible to set up one-to-one
correspondences between natural numbers and strings, computation procedures
can also be viewed as string processing; an input string w is processed into an
output string $f(w)$. In this case we are considering functions $f : \Sigma^* \to \Sigma^*$. But
it is often natural to think of computation procedures as computing values of
numerical functions.

In the second case, the object is to decide questions such as: Is the
number n prime? Does the number a divide the number b? Such questions define
properties $P(n)$, binary relations $R(a, b)$, or in the general case, n-ary relations
$R(a_1, \ldots, a_n)$. Algorithms for such questions are called decision procedures.
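The two questions mentioned above can serve as a small illustration of decision procedures. The sketch below uses trial division, chosen only for simplicity; the function names are assumptions of this example.

```python
def divides(a, b):
    """Decision procedure for the binary relation R(a, b): does a divide b?"""
    if a == 0:
        return b == 0
    return b % a == 0

def is_prime(n):
    """Decision procedure for the property P(n): is n prime?"""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if divides(d, n):
            return False
        d += 1
    return True
```

Both procedures terminate on every input and answer only yes or no, in contrast to a computation procedure, which returns a value.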

Decision procedures are often formulated in terms of languages. A language 
is a set of strings constructed in some way or satisfying certain properties. The 
problem is to determine for an arbitrary string over the alphabet whether it 
belongs to the language or not. This is a typical decision problem, having a 
yes or no answer. In fact, all decision problems can be formulated in terms of 
language membership. We will return to these notions in section 2.3.4.

2.2 A note on the connection to everyday computing

The contents of this chapter might seem remote from everyday computers and 
their uses, and it is perhaps interesting to make a comment on the connection. 
As already remarked, many algorithms are not meant to terminate. We could
take a word processor as an example. A word processor is a complicated piece of 
software, which apart from presenting the user with a graphical interface, also 
should respond to input from the keyboard as well as files read from storage 
media. Such programs are said to be event-driven. This means that they only 
perform actions when called for by request from the user, otherwise they are 
idle. (There might be actions like rewriting the screen invoked by other parallel 
running processes.) When the user hits a key, this triggers a computation that 
results in the text stored in the memory and displayed on the screen being 
altered. This can in fact be seen as a computation of a function, or better still, 
as string processing. The input is the present text stored (the coded text really) 
together with the code for the key. The output is the new text. Depending 
on what action is requested, different computations are performed on the text. 
Thus the concept of a computation is very strong and in fact incorporates all 
types of data processing made by digital computers. 

2.3 The classical Turing machine model of computation

In a mathematically oriented approach to the theory of computability the
distinction between passive algorithm and active computation can easily be
overlooked. Just one example will suffice to make the point clear. Most popular
or semi-popular accounts of Turing Machines abound with animistic phrases 
like reading, writing, moving etc. This conjures up the image of a magnetic 
read-and-write head moving along the tape, erasing and writing information. 
Exact mathematical notions can replace this suggestive but imprecise termi- 
nology, one classic example of which is iB^ . On the other hand, upon reading 
such a mathematical account of computability, it seems that the mathematical 
formulation has got rid of all reference to motion or time steps. Close scrutiny 
however reveals the distinction between algorithm and computation even in this 
case. The Turing machine program is the set of instructions for the machine
together with a specification of how to present the machine with data and how to
read off data. This is clearly passive. However, in order to actually perform the
computation inherent in the set of instructions, someone, machine or human, 
has to perform the computational steps.

2.3.1 Informal description of Turing machines

A very good description of Turing machines can be found in Turing's original
paper [17]. In fact, most modern informal descriptions are just rephrasings of
Turing's own words. This is true also for what follows here.

The machine consists of a memory, a read-and-write head and a processing 
unit. 

The memory is a tape which is divided into distinct squares, also called cells. 
It is infinite to the left and to the right^. The memory tape is used for giving 
input to the machine, for storing intermediate data during computation and for 
writing output. 

The read-and-write head can move along the tape. It can read symbols 
written on the tape (this is called scanning) and it can write symbols on the 
tape. 

The symbols can be any symbols, but they must come from a finite alphabet
$\Gamma = \{S_0, S_1, \ldots, S_n\}$.

The machine has a finite set of elementary operations that it can perform at 
each step in the computation. These are 

• move one step to the right 

• move one step to the left 

• write a symbol 

• erase a symbol 

• halt 

This can symbolically be written as a set of operations 

\[ O = \{\mathrm{moveright}, \mathrm{moveleft}, \mathrm{write}(S_i), \mathrm{erase}, \mathrm{halt}\} \]

Note that reading a symbol need not be considered to be an operation. In 
fact, the machine always reads the symbol written on the scanned square. In 
some formulations, the operation erase is replaced by writing a special symbol
called a blank, i.e. by the operation write(blank).

The halt operation can be implemented in different ways. 

The machine is controlled by a set of instructions. This is the program.
In order to distinguish the instructions, the machine is considered to be in one
of a set of different machine states. The states are numbered or given symbolic
names from a set $Q = \{q_0, q_1, \ldots, q_n\}$. Each instruction consists of four symbols
(present state, scanned symbol on tape, operation, new state) or $(q_i, S, op, q_j)$
where $q_i \in Q$, $S \in \Gamma$, $op \in O$, $q_j \in Q$.

^For practical purposes, if one would like to write a computer program emulating a Turing
machine, it might be easier to consider a one-way tape with a start square to the left and infinite
to the right.

The program is executed by a control unit. Execution starts in a special
initial machine state $q_0$ scanning the leftmost symbol on the tape. At the
beginning of the computation, all but a finite number of tape squares are blank.
This is true throughout the computation. At each step of the execution, the
control unit checks through the list of instructions to find an instruction that
matches the present state of the machine and the scanned symbol. Each cycle
of the execution therefore consists of the following actions:

• get present state $q_{present}$

• get scanned symbol $\sigma_{scanned}$

• find matching instruction $(q_{present}, \sigma_{scanned}, op, q_{new})$

• execute the operation op

• change to the new state $q_{new}$ as given in the matching instruction

Some comments

Infinite memory in the form of an infinite tape is of course impossible in reality. 
But this is not a problem. At any stage of the computation, only a finite set of 
squares is needed. Should the machine ever run out of tape, a finite amount of 
new tape can always be added to the right or to the left. The number of squares 
on the tape thus need not be actually infinite, only potentially infinite. 

As regards the symbols, the simplest choice is {0, 1} where '0' serves the
purpose of a blank. Numbers are coded as strings of '1' separated by '0'. The
number 0 itself must be coded as '1' in order to distinguish it from a blank,
and consequently 1 is coded as "11" and so on. If one wants to use a more
efficient binary coding, one can use an alphabet consisting of '0', '1' and a blank
separator.

In some formulations of Turing machines, the operations of writing and 
moving are combined into a single operation. In that case an instruction consists
of five symbols $q_i S_k S_l q_j M$, where M denotes a move.

Furthermore, a special halt instruction is not needed. The machine stops or 
halts when the control unit cannot find any matching instruction. In practice, 
though, it is convenient to include an explicit halt instruction. In fact, when 
discussing decision problems in terms of Turing machines, it is natural to have 
two halting states, for example named by yes and no. 

The names of the states are arbitrary, they can be named in any way that 
serves the purpose of clarity. 
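The informal machine of this section can be sketched as a short Python program. The sketch makes several assumptions that are not fixed by the text: the program is stored as a dictionary from (state, scanned symbol) to (operation, new state), the tape is a dictionary of cells so that it can grow in both directions, '0' serves as the blank, and a step limit guards against non-terminating runs. The example program APPEND_ONE is hypothetical.

```python
BLANK = "0"  # as in the text, '0' plays the role of the blank

def run(program, tape, state="q0", max_steps=10_000):
    """Run a machine given as quadruples on the string `tape`.

    `program` maps (present state, scanned symbol) to (operation, new state).
    An operation is "moveright", "moveleft", "erase", "halt" or ("write", s).
    Returns the final state and the used part of the tape.
    """
    cells = dict(enumerate(tape))
    head = 0
    for _ in range(max_steps):
        scanned = cells.get(head, BLANK)              # get scanned symbol
        instruction = program.get((state, scanned))   # find matching instruction
        if instruction is None:                       # no match: the machine stops
            break
        op, new_state = instruction
        if op == "moveright":                         # execute the operation
            head += 1
        elif op == "moveleft":
            head -= 1
        elif op == "erase":
            cells[head] = BLANK
        elif op != "halt":                            # ("write", symbol)
            cells[head] = op[1]
        state = new_state                             # change to the new state
        if op == "halt":
            break
    else:
        raise RuntimeError("step limit reached without halting")
    used = [i for i, s in cells.items() if s != BLANK]
    if not used:
        return state, ""
    return state, "".join(cells.get(i, BLANK) for i in range(min(used), max(used) + 1))

# Hypothetical example program: append one '1' to a block of '1's.
APPEND_ONE = {
    ("q0", "1"): ("moveright", "q0"),
    ("q0", "0"): (("write", "1"), "q1"),
    ("q1", "1"): ("halt", "qh"),
}
```

Calling run(APPEND_ONE, "11") walks to the first blank, writes a '1' there and halts. A machine started with an empty program stops immediately, which illustrates the remark that an explicit halt instruction is a convenience rather than a necessity.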

In the next section, a formal definition of a Turing machine is given. It 
does not entirely conform to the informal description given above. The reader 
unfamiliar with Turing machines might benefit from comparing the details. 

2.3.2 Formal definition of a Turing Machine Model 

There are lots of variations of the basic definitions of a Turing machine in
the literature, differing in details and level of formalization. I will choose an
approach that is quite formal in order to remove as much of the physics as
possible, following [16]. See also ^Sl for a modern treatment. The parts of the
definition are all motivated by the intended semantics of Turing machines.

Program 

Consider an alphabet $A_M$ consisting of the following tokens

1. a finite non-empty set of symbols $\Sigma = \{S_1, S_2, \ldots, S_n\}$, not containing
the tape blank $\sqcup$, the separator # or any other tape markers. $\Sigma$ is called the input
alphabet.

2. a finite non-empty set of tape symbols $\Gamma$, one of which is the tape blank
$S_0 = \sqcup$ used to mark tape ends. $\Gamma$ also contains any other tape markers
such as # used to separate the input into tuples. Note that $\Sigma \subset \Gamma$.

3. a finite non-empty set of internal configurations $Q = \{q_0, q_1, \ldots, q_n\}$, also
called machine states.

4. a (small) finite set of halting configurations $Q_h$, where $Q_h \cap Q = \emptyset$.

5. a set of moves $M = \{L, R\}$.^

One of the machine states, $q_0$, is singled out as the start state. It is also
convenient to single out halt states. If the machine is programmed for computation
problems, one halt state, $q_h$, suffices. If the machine is used for decision
problems, two halt states $\{q_y, q_n\}$, corresponding to the answers yes and no, are
singled out. Note that the halting states are not in Q.

An expression is a finite sequence of tokens chosen from $A_M$. An instruction
is an expression having one of the following forms

• $q_i S_k S_l q_j R$

• $q_i S_k S_l q_j L$

The intuition is that, if the machine is in the configuration $q_i$ scanning the
symbol $S_k$, it prints the symbol $S_l$, changes configuration to $q_j$ and makes a
move R or L.

A Turing machine M has a program $P_M$ that is a finite non-empty set of
instructions. The program can be thought of as defining a transition function

\[ \delta : Q \times \Gamma \to (Q \cup Q_h) \times \Gamma \times M. \tag{2.2} \]

This definition makes explicit that there are no transitions from the halting
configurations.

As an example of the correspondence between instructions and the transition
function, note that the instruction $q_i S_k S_l q_j R$ corresponds to
$\delta(q_i, S_k) = (q_j, S_l, R)$. If the transition function is undefined for a certain
$q_i S_k$ then there simply is no instruction in the program $P_M$ with the first two
symbols equal to $q_i S_k$. In that case the machine gets stuck. This should be
considered a programming error, unless the configuration $q_i$ is one of the halting
configurations.

^Sometimes it is useful to include a "no move" S (Stay).

No two instructions have the same first two symbols $q_i S_k$. This means
that at each step of the computation, the action of the machine is uniquely 
determined. Therefore, what we have defined so far are deterministic Turing 
machines. Removing this restriction leads to the classes of non-deterministic, 
probabilistic and quantum Turing machines respectively. These will be consid- 
ered in sections 2.7, 2.8 and 6.2. 

To prepare the way for these generalizations of the Turing machine concept, the
transition function can be defined in a different way. Define the function $\Delta$

\[ \Delta : Q \times \Gamma \times (Q \cup Q_h) \times \Gamma \times M \to \{0, 1\}. \tag{2.3} \]

Clearly, this is a function from instructions to the set {0, 1}. For all instructions
in the program $P_M$, $\Delta$ evaluates to 1. Otherwise it evaluates to 0 (i.e. the
instruction is not contained in the program).

This can be formalized somewhat further. Consider the set I of all possible
instructions

\[ I = Q \times \Gamma \times (Q \cup Q_h) \times \Gamma \times M. \tag{2.4} \]

This is a finite set and a program $P_M$ is a subset of this set, or $P_M \subseteq I$.^
The function $\Delta$ can therefore be written as

\[ \Delta : I \to \{0, 1\} \quad \text{where} \quad \Delta(i) = \begin{cases} 1 & \text{if } i \in P_M \\ 0 & \text{if } i \notin P_M \end{cases} \tag{2.5} \]

Note that in this way $\Delta$ is naturally defined as a total function on the finite
set of instructions I.
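For a small machine, the set of all possible instructions and the indicator function on it can be written out explicitly. The sketch below uses a hypothetical two-state machine; the concrete sets, the token "_" for the tape blank and the names INSTRUCTIONS, PROGRAM, delta_indicator and is_deterministic are assumptions made for illustration. Tuples are ordered as in (2.4): configuration, scanned symbol, new configuration, printed symbol, move.

```python
from itertools import product

Q = ["q0", "q1"]      # internal configurations (hypothetical example)
Q_HALT = ["qh"]       # halting configurations, disjoint from Q
GAMMA = ["_", "1"]    # tape symbols, "_" standing for the tape blank
MOVES = ["L", "R"]

# The finite set of all possible instructions.
INSTRUCTIONS = set(product(Q, GAMMA, Q + Q_HALT, GAMMA, MOVES))

# A program is a subset of that set.
PROGRAM = {
    ("q0", "1", "q0", "1", "R"),
    ("q0", "_", "qh", "1", "R"),
}

def delta_indicator(instruction):
    """Return 1 if the instruction belongs to the program, 0 otherwise."""
    return 1 if instruction in PROGRAM else 0

def is_deterministic():
    """At most one instruction may start with a given (configuration, symbol)."""
    return all(
        sum(delta_indicator((q, s, q2, s2, m))
            for q2, s2, m in product(Q + Q_HALT, GAMMA, MOVES)) <= 1
        for q, s in product(Q, GAMMA)
    )
```

In this formulation the determinism condition becomes a statement about sums of the indicator over all instructions sharing the same first two tokens.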



^The power set $\mathcal{P}(I)$ of I is the set of all Turing machine programs for the given alphabet
and configuration set. So we could also write $P_M \in \mathcal{P}(I)$.

Data

A tape expression is an expression consisting entirely of symbols from the set
$\Gamma$. Denoting generic tape expressions by calligraphic letters, for example $\mathcal{T}$, we
can consider tape expressions as split in left $\mathcal{L}$ and right $\mathcal{R}$ parts, and $\mathcal{T} = \mathcal{L}\mathcal{R}$.
By convention, we always mark tape ends by the symbol $S_0 = \sqcup$, so that the tape
actually looks as $\sqcup \mathcal{L}\mathcal{R} \sqcup$.

Executing Turing Machine

An instantaneous description $\alpha$ of a Turing machine is an expression satisfying the following
requirements

1. it contains exactly one $q_i$

2. this $q_i$ is not the rightmost token

3. it does not contain R or L

4. for all the symbols $S_k$ in $\alpha$, $S_k \in \Gamma$.

The state $q_i$ is called the instantaneous internal configuration at $\alpha$. The
symbol $S_k$ immediately to the right of $q_i$ in $\alpha$ is called the scanned tape symbol.
In practice, an instantaneous description is a tape expression with exactly one
configuration symbol $q_i$ inserted directly to the left of the scanned symbol.
Since there must always be a scanned symbol, $q_i$ cannot be the rightmost token.
Formally, a general instantaneous description can be written as $\mathcal{L} q_i \mathcal{R}$ where the
right tape expression $\mathcal{R}$ must be non-empty.

So far everything is static. In order for the Turing machine to actually
perform a computation we need a computation relation or a set of rewrite rules
$\alpha \to \beta$, allowing us to pass from one instantaneous description to another. In
the following, $\mathcal{X}$ and $\mathcal{Y}$ denote tape expressions, possibly just strings of blanks.

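The computation relation $\alpha \to \beta$ is defined by

1. If $\alpha = \mathcal{X} q_i S_k \mathcal{Y}$ and $q_i S_k S_l q_j R \in P_M$ then $\beta = \mathcal{X} S_l q_j \mathcal{Y}$
// print and move right

2. If $\alpha = \mathcal{X} q_i S_0$ and $q_i S_0 S_l q_j R \in P_M$ then $\beta = \mathcal{X} S_l q_j S_0$
// print and right move at right end of tape, insert blank

3. If $\alpha = \mathcal{X} S_m q_i S_k \mathcal{Y}$ and $q_i S_k S_l q_j L \in P_M$ then $\beta = \mathcal{X} q_j S_m S_l \mathcal{Y}$
// print and move left

4. If $\alpha = q_i S_0 \mathcal{Y}$ and $q_i S_0 S_l q_j L \in P_M$ then $\beta = q_j S_0 S_l \mathcal{Y}$
// print and left move at left end of tape, insert blank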

In practice, the rewrite rules are applied by searching for an instruction 
having the first two symbols matching the machine state and the scanned tape 
symbol. Then, at each step in the computation, the machine scans the symbol 
on the tape, prints a new symbol and performs a move according to the rules. 

An instantaneous description is terminal if none of the rewrite rules apply. 

A computation is a finite sequence $\alpha_1, \alpha_2, \ldots, \alpha_p$ of instantaneous descriptions
such that $\alpha_i \to \alpha_{i+1}$ for $1 \le i < p$ and such that $\alpha_p$ is terminal. The result of
the computation is written $M(\alpha_1)$ which we define as $\alpha_p$. Using the notation
$\to^*$ to denote a computation in several steps, we have

\[ \alpha_1 \to^* \alpha_p = M(\alpha_1). \tag{2.6} \]
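The rewrite rules translate almost directly into code acting on instantaneous descriptions. The sketch below makes a few assumptions of its own: a description is a Python tuple of tokens, the state token is the only token starting with "q", the program is a dictionary from the first two tokens of an instruction to the remaining three, "_" stands for the tape blank, and a blank is inserted whenever the machine moves past either end of the description (the rules in the text state this for a scanned blank only). The example program SUCCESSOR is hypothetical.

```python
BLANK = "_"  # stands for the tape blank

def step(alpha, program):
    """Apply one rewrite rule to the instantaneous description `alpha`.

    `program` maps (configuration, scanned symbol) to
    (printed symbol, new configuration, move). Returns the next
    description, or None if `alpha` is terminal.
    """
    pos = next(i for i, token in enumerate(alpha) if token.startswith("q"))
    left, state, scanned, right = alpha[:pos], alpha[pos], alpha[pos + 1], alpha[pos + 2:]
    if (state, scanned) not in program:
        return None
    printed, new_state, move = program[(state, scanned)]
    if move == "R":
        if right:                                          # rule 1
            return left + (printed, new_state) + right
        return left + (printed, new_state, BLANK)          # rule 2: insert a blank
    if left:                                               # rule 3
        return left[:-1] + (new_state, left[-1], printed) + right
    return (new_state, BLANK, printed) + right             # rule 4: insert a blank

def run(alpha, program, max_steps=10_000):
    """Rewrite until a terminal description is reached and return it."""
    for _ in range(max_steps):
        beta = step(alpha, program)
        if beta is None:
            return alpha
        alpha = beta
    raise RuntimeError("no terminal description within the step limit")

# Hypothetical example: move right over a block of 1's and append one more.
SUCCESSOR = {
    ("q0", "1"): ("1", "q0", "R"),
    ("q0", "_"): ("1", "qh", "R"),
}
```

Starting from ("q0", "1", "1", "_"), three rewrite steps lead to the terminal description ("1", "1", "1", "qh", "_"). The step limit is needed because nothing in the definition guarantees that a terminal description is ever reached.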

This is the formal definition. Some comments are obviously in order. The
question of whether the Turing machine halts or not is the same as whether
there exists a computation or not. The tape is considered to be potentially
infinite. This is taken care of by the computation rules (2) and (4) which have
the effect of inserting blank squares at the ends of the tape when the machine
is about to run off the tape. In any computation, only a finite amount of tape
is ever used.

In order to get something done with this model, a few more choices must be 
made. We need a way to represent input data as tape expressions and a way to 
read off output from the terminal description. Some convention is needed for 
how to start the computation and what should be considered a proper terminal 
state. 

A note on terminology 

What I have denoted by the term internal configuration is often denoted by
the term state in the literature on classical Turing machines. My terminology
instead follows that of [16], who uses the term internal configuration. It is more
appropriate in the present context where we subsequently want to consider
quantum Turing machines. There, we want to reserve the word state to denote
the quantum state made up of the internal configuration of the machine together
with the tape expression. This is what I (again following Davis) denote by
instantaneous description. Thus, we define the state of a Turing machine to be
synonymous with the instantaneous description. It seems reasonable in the present
context to let the quantum physics usage of the word state take precedence.

To summarize, the term state is equivalent to the term instantaneous
description. When the set Q is referred to, I use the terms (internal) configuration
and machine state as synonyms. So the qualifiers internal and machine are
equivalent.

Furthermore, to have some connection with intuition, we can think of the 
tape as the memory, the contents of which is the tape expression. Then it 
makes sense to think of the set of internal configurations as the processor.^ The 
scanned tape symbol can likewise be marked by a cursor. 

Representing numeric input and output data 

Suppose we want to compute numerical functions $f : N^d \to N$. The simplest
choice is to use a one-symbol alphabet with $S_1 = 1$ and a unary representation
of numbers. Since we need to distinguish the number 0 from a blank, we
let 0 be represented by 1, 1 by 11, 2 by 111 etc. Tuples of numbers are represented
as unary numbers separated by the separator #. So a pair (3, 5) is represented
by the tape expression 1111#111111. The generalization to d-tuples is obvious.
The following notation is convenient.

Let $\bar{n} = 1^{n+1}$, i.e. a block of n + 1 consecutive 1's. Then the d-tuple
$(n_1, n_2, \ldots, n_d)$ is represented by the tape expression

\[ \overline{(n_1, n_2, \ldots, n_d)} = \bar{n}_1 \# \bar{n}_2 \# \cdots \# \bar{n}_d. \tag{2.7} \]

^It is actually a finite state machine.

In order to start the computation according to the definition, an initial
instantaneous description must be given. We set

\[ \alpha_1 = q_0 \overline{(n_1, n_2, \ldots, n_d)}. \tag{2.8} \]

The numeric result of the computation should be read off from the terminal
configuration. The only available way to do this is to remove the single $q_i$, upon
which we get a tape expression, which must be interpreted as a number. A
simple interpretation is to count the number of occurrences of 1, neglecting #.
Another choice, more restrictive, is to demand that the terminal state consists
of a single consecutive stretch of 1's on an otherwise blank tape.

The question of which terminal states should count as yielding acceptable 
output is really a question of how to code output data, but it affects the way 
programs for the machine are written. A choice often made is to demand that 
the machine should halt scanning the leftmost symbol on an otherwise blank 
tape. Then one has to add instructions to clean up the tape after the compu- 
tation proper is finished and then move left to the leftmost symbol. Whether 
this is worthwhile is a matter of taste. Formally, this choice of output coding
corresponds to a terminal state of the form $\alpha_p = q_h \bar{n}$.

This means that there is no instruction having the first two tokens $q_h 1$.
Thus $q_h$ is the halting state (or one of the halting states). I will call this a
standard terminal configuration.^

Let us finally connect these, admittedly a bit heavy-handed, notations to
functions by explicitly identifying computations and functions.

We associate a function $f_M : N^d \to N$ with a Turing machine M in the
following way.

For each d-tuple $(n_1, n_2, \ldots, n_d)$ we set the initial state $\alpha_1 = q_0 \overline{(n_1, n_2, \ldots, n_d)}$.

(a) If there exists a computation $\alpha_1, \alpha_2, \ldots, \alpha_p$ such that

\[ M(q_0 \overline{(n_1, n_2, \ldots, n_d)}) = q_h \bar{n} \]

then

\[ f_M(n_1, n_2, \ldots, n_d) = n. \]

(b) If no computation exists then $f_M(n_1, n_2, \ldots, n_d)$ is undefined.
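The unary input and output convention used in this definition can be sketched as a pair of encoding and decoding helpers. The function names are illustrative assumptions; the separator # and the block of n + 1 ones follow the text.

```python
def encode_number(n):
    """Unary code of n: a block of n + 1 ones, so that 0 is not a blank."""
    return "1" * (n + 1)

def encode_tuple(numbers):
    """Code of a d-tuple: the unary codes separated by '#'."""
    return "#".join(encode_number(n) for n in numbers)

def decode_tuple(tape):
    """Inverse of encode_tuple for a well-formed tape expression."""
    return tuple(len(block) - 1 for block in tape.split("#"))
```

With these helpers the pair (3, 5) is encoded as 1111#111111, in agreement with the example given earlier.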

More efficient numeric input/output conventions

The unary description of numeric data is highly inefficient. It takes an
exponential amount of tape space to represent a number as compared to a binary
representation. Using n bits, which can be written on n tape cells, numbers
ranging from 0 to $2^n - 1$ can be represented, giving a logarithmic decrease of
space requirements. It is convenient to use an alphabet {0, 1, #} with an explicit
blank symbol # used to separate the numbers and write numbers in binary
notation.

^For practical programming purposes, one can note that three situations can be envisioned:
(1) the computation does not terminate (halt) and no output data results, (2) the computation
terminates in a standard configuration and (3) the computation terminates in a non-standard
configuration. It seems that allowing this last case to define output data, though not wrong in
principle, is a bit risky in practice as one has less control over the workings of the computation.
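The difference in space requirements between the two codings is easy to tabulate. The sketch below counts tape cells only; the function names are illustrative, and the unary count assumes the block of n + 1 ones used in this chapter.

```python
def unary_cells(n):
    """Tape cells needed for n in the unary coding (a block of n + 1 ones)."""
    return n + 1

def binary_cells(n):
    """Tape cells needed for n in binary notation (at least one cell for 0)."""
    return max(1, n.bit_length())

# Print a small comparison table: number, unary cells, binary cells.
for n in (0, 1, 7, 255, 10**6):
    print(n, unary_cells(n), binary_cells(n))
```

For instance, the number 255 occupies 256 cells in unary but only 8 cells in binary.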

Leaving the actual encoding of numbers open (unary, binary or in any other
base), we can use the same notation as in the preceding section. Let $\bar{n}$ be the
code for the number n and $\overline{(n_1, n_2, \ldots, n_d)}$ the code for the d-tuple of numbers
$(n_1, n_2, \ldots, n_d)$, then

\[ f_M(n_1, n_2, \ldots, n_d) = M(q_0 \overline{(n_1, n_2, \ldots, n_d)}), \]

if the computation exists, otherwise the function is undefined.

It is often convenient to simplify the notation somewhat and simply write
$M(n_1, n_2, \ldots, n_d)$ for $M(q_0 \overline{(n_1, n_2, \ldots, n_d)})$, so that $f_M(x)$ and $M(x)$ denote the
same object.

The blank symbol confusion 

There seems to be some confusion in the literature as to how to treat the blank
symbol. In formal language theory, if one wants to have strings with blanks
in them, the obvious way is simply to include a special blank symbol among
the symbols. Since a string is always finite, there is no need to designate the
beginning and end of a string in any special way. In particular, starting a string
with blanks, or ending it with blanks, makes no sense. Such blanks would be
trimmed away.

However, in Turing machine theory, the tape is potentially infinite, and the
machine needs some way to know where the right and left ends of the actually
used part of the tape are. The obvious way would be to use the blank symbol as
such a designator. This is often phrased as "a string written on an otherwise
blank tape". But then, if the blank is in the alphabet, and the string written on
the tape contains blanks, the machine will not know whether a blank designates
the end of the actually used tape, or if it is a blank within the string (as in the
case where the blank separates the numbers $n_i$ in a d-tuple input). One way
around this dilemma is to use two consecutive blanks, $\sqcup\sqcup$, to designate tape
ends. In that case, the languages defined over the alphabet need to exclude
strings with two consecutive blanks, otherwise the confusion remains. Thus,
when a language L over the alphabet $\Sigma$ is mentioned, it is understood that no
words in the language contain two consecutive blanks.

Another way is to include two different 'blank' symbols, for example $\{\sqcup, \#\}$,
one, $\#$, denoting 'string blanks' or input separators, the other, $\sqcup$, designating the
'left' and 'right' ends of the tape. This is the convention used in the present 
work. Any other 'language' blanks play no role in defining the general model of 
Turing machines and need only be defined in specific examples. 

String processing 

The model is, of course, not restricted to computing numeric functions. In
general, a Turing machine performs string processing, taking an input string
$w_i$ from the set of strings $\Gamma^*$, producing an output string $w_o$ if it halts on the
input. More formally, the Turing machine M defines a partial function

$$f_M : \Gamma^* \to \Gamma^* \qquad (2.9)$$

where

$$f_M(w_i) = w_o \in \Gamma^* \qquad (2.10)$$

if M halts on input $w_i$, undefined otherwise.
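The partial function $f_M$ of (2.9) and (2.10) can be mimicked in a few lines of Python. The sketch below is only an illustration under assumptions that are not part of the text: instructions are taken to be quintuples (state, scanned symbol, new state, written symbol, head move), the machine halts when no instruction matches, and the names `run`, `FLIP`, `RUNAWAY` and the `max_steps` budget are invented for the example. Returning `None` when the budget runs out merely stands in for 'undefined'; a finite budget can never establish that a machine fails to halt.

```python
from typing import Dict, Optional, Tuple

BLANK = "_"  # stands in for the tape-end blank symbol (an assumption of this sketch)

# (state, scanned symbol) -> (new state, symbol to write, head move)
Program = Dict[Tuple[str, str], Tuple[str, str, int]]

def run(program: Program, w_in: str, start: str = "q1",
        max_steps: int = 10_000) -> Optional[str]:
    """Return the output string if the machine halts within max_steps, else None."""
    tape = dict(enumerate(w_in))   # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        key = (state, tape.get(head, BLANK))
        if key not in program:     # no matching instruction: the machine halts
            lo, hi = min(tape, default=0), max(tape, default=0)
            cells = (tape.get(i, BLANK) for i in range(lo, hi + 1))
            return "".join(cells).strip(BLANK)
        state, symbol, move = program[key]
        tape[head] = symbol
        head += move
    return None                    # budget exhausted; halting remains unknown

# Hypothetical example machine: complement every bit, halt at the first blank.
FLIP: Program = {
    ("q1", "0"): ("q1", "1", +1),
    ("q1", "1"): ("q1", "0", +1),
}

# Hypothetical machine that never halts: it walks right forever.
RUNAWAY: Program = {("q1", s): ("q1", s, +1) for s in ("0", "1", BLANK)}
```

Here `run(FLIP, "0110")` plays the role of $f_M(w_i)$ for a machine that halts, while `run(RUNAWAY, "01")` returns `None` because the budget is exhausted.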

The state graph 

Since the set of tape expressions is infinite, the space of instantaneous
descriptions, or states, is also infinite. A computation can be viewed as a directed graph
in this space. This graph will be denoted by $G_C = (S_V, T_E)$ where $S_V$ (state
vertices) denotes the set of vertices corresponding to states in the computation,
and $T_E$ (transition edges) denotes the set of edges corresponding to transitions,
i.e. computational steps. Two vertices $v_i$ and $v_j$ are connected by an edge if
there is a corresponding instruction in the program, taking the machine from
state $v_i$ to state $v_j$.

Note that for a deterministic Turing machine, the state graph is simply a 
path in the state space. For a computation that halts, i.e. a computation that 
starts in a certain state and ends in another state, the path is non-intersecting. 
This can be understood as follows. If the path intersected itself, so that the 
machine returned to an 'earlier' state, then the machine would enter an infinite 
loop, and would not halt. Therefore, terminating computations correspond to
linear paths.
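The path picture can be illustrated on any deterministic step function. In the following sketch, which is not tied to a particular Turing machine encoding, a 'state' is any hashable Python value, `step` returns the successor state or `None` when the machine halts, and the names `trace`, `countdown` and `cycle` are hypothetical.

```python
from typing import Callable, Hashable, List, Optional, Tuple

def trace(step: Callable[[Hashable], Optional[Hashable]], start: Hashable,
          max_steps: int = 1000) -> Tuple[List[Hashable], str]:
    """Follow a deterministic computation, recording the path of states visited."""
    path, seen = [start], {start}
    current = start
    for _ in range(max_steps):
        nxt = step(current)
        if nxt is None:            # no applicable instruction: the machine halts
            return path, "halts"
        if nxt in seen:            # the path would intersect itself: infinite loop
            return path, "loops"
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path, "unknown"         # still running, and no state has repeated so far

# Toy deterministic 'machines' on integer states (illustrative only).
countdown = lambda n: n - 1 if n > 0 else None   # halts when it reaches 0
cycle = lambda n: (n + 1) % 3                    # 0 -> 1 -> 2 -> 0 -> ...
```

A non-halting computation need not revisit a state: a machine that keeps moving right produces an infinite non-intersecting path, which is why the sketch can only answer "unknown" in that case.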

Concluding the formal definition of a Turing machine 

When defined in this way, everything looks static. Where does motion enter? 
Well, the computation, i.e. the series of instantaneous descriptions must be 
computed, at least once for each input d-tuple. Someone or something has to 
do this, human or machine. This is where motion enters. This is obvious if one 
considers doing the calculation with pen and paper. 

Also note that it is sometimes convenient to work with Turing machines with 
several tapes with concomitant read/write heads. 

Apart from the question of actually performing the computations, this is
a formal theory of computation. There are many other models of computation.
Of the formal ones, we have the (Herbrand-Gödel) recursive functions and
Church's $\lambda$-calculus, both contemporary with the Turing model. The RAM
(Random Access Machine) model is close to an actual computer. Then there
are lots of simplified programming languages, containing just the bare minimum
of constructions. A survey of computational models can be found in [22].

2.3.3 Syntax and semantics



It is interesting in this context to digress slightly and discuss the question of 
syntax vs semantics for this model. The Turing machine model reduces com- 
putation to syntax. Everything written above could easily be phrased in terms 
of a specification of a formal language. No meaning is conferred to the ele- 
ments of the model. One does not need to understand the tape expressions or 
the instructions in order to carry them out. An executing Turing machine just 
performs meaningless string processing. 

The grammar of the language is the specification of what constitutes well-formed
programs (sets of instructions) and well-formed instantaneous
descriptions, in particular the initial description. The computational rules, also
syntactically defined, then tell us how to perform computations within the
model. The data itself has no meaning; it is just a string of symbols.

The semantics of the model enters through the interpretation of the initial 
and final tape expressions as defining sets of natural numbers, or objects from 
some other set of mathematical objects, functionally related via the computa- 
tion. Thus, we can say that we understand a Turing machine if we understand 
what it computes. 

2.3.4 Decision procedures and Computation procedures revisited

We will now refine our notions about algorithms for decisions and computations 
respectively. 

Deciding recursive languages 

Let $L \subset \Gamma^*$ be a language, i.e. a set of strings defined over the alphabet $\Gamma$.
Next, let M be a Turing machine and $x \in \Gamma^*$ be an input string. We say that
M decides the language L if the following conditions hold

$$\begin{cases} M(x) \succ q_y & \text{if } x \in L \\ M(x) \succ q_n & \text{if } x \notin L \end{cases} \qquad (2.11)$$

Here we use the notation $M(x) \succ q_y$ to denote the sentence: "The machine
M on input string x halts in the configuration $q_y$", and similarly in the other
case.

If the language L is decided by some Turing machine M, then L is a recursive
language.

There is also a weaker form of decision procedures.

Recognizing recursively enumerable languages

We say that M recognizes the language L if the following conditions hold

$$\begin{cases} M(x) \succ q_y & \text{if } x \in L \\ M(x) \succ \triangleright & \text{if } x \notin L \end{cases} \qquad (2.12)$$

where by $M(x) \succ \triangleright$ we denote the sentence: "The machine M on input string
x does not halt".

If the language L is accepted by some Turing machine M, then L is a
recursively enumerable language. It is obvious that a recursive language is also
recursively enumerable. The weaker form is still strong enough to enumerate the 
strings in the language. By judiciously employing the machine M that accepts 
the language, the strings of the language can be enumerated. The intuition is 
that, if a string belongs to the language, it will be found eventually. But for a 
string not yet accepted, there is no way of ascertaining that it does not belong 
to the language. 
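One way to read "judiciously employing the machine" is dovetailing: at stage $t$ the first $t$ strings are each run for $t$ steps, so that no single non-halting run blocks the enumeration. The sketch below assumes that the recognizer is available as a step-bounded predicate `accepts_within(w, steps)`; that interface, and the toy recognizer `even_ones_within`, are assumptions made for the example.

```python
from itertools import count, islice, product
from typing import Callable, Iterator

def all_strings(alphabet: str = "01") -> Iterator[str]:
    """Enumerate every finite string over the alphabet, shortest first."""
    for n in count(0):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def enumerate_language(accepts_within: Callable[[str, int], bool],
                       alphabet: str = "01") -> Iterator[str]:
    """Dovetail: at stage t, run the first t strings for t steps each."""
    emitted = set()
    for t in count(1):
        for w in islice(all_strings(alphabet), t):
            if w not in emitted and accepts_within(w, t):
                emitted.add(w)
                yield w

# Hypothetical recognizer for strings with an even number of 1s: it 'needs'
# len(w) + 1 steps to accept, and simply never halts on all other strings.
def even_ones_within(w: str, steps: int) -> bool:
    return w.count("1") % 2 == 0 and steps >= len(w) + 1
```

Every string of the language is eventually produced, but at no finite stage does the absence of a string from the output show that it lies outside the language.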

When a machine is used for decision problems, the output is really encoded
in the halting states $\{q_y, q_n\}$ and the tape contents at halting have no special
significance.

Computing recursive functions 

Suppose that $f$ is a function from $\Gamma^*$ to $\Gamma^*$. If there is a Turing machine that
computes $f$ as in (2.9) and (2.10), $f$ is called a recursive function.

2.3.5 The Church-Turing Thesis

The Church-Turing thesis identifies the set of effectively (intuitively) computable
functions with the set of functions computable within any of the classical
computational models: Turing machines, $\lambda$-definable functions or general recursive
functions. It was originally formulated by Church in terms of general recursive
functions, but Turing made similar remarks in reference to his model, hence the
name Church-Turing thesis (see several articles in ^Sl). Historically, effective
computability meant computability by a human computer who works to precise
rules. Later the thesis has acquired connotations connecting it to machine
computation, in particular electronic digital computing machines. In this sense
the thesis is certainly true; what can be computed by a general purpose digital
computer can be computed by a Turing machine.

The literature contains stronger statements to the effect that anything that 
can be computed by a machine can be computed by a Turing machine (for a 
discussion, see 02]). This is a much stronger statement. It is a statement about 
every conceivable physical system that can be harnessed to perform computa- 
tions. Whether it is true or not is unknown. To determine if this statement 
might be true or not, we would have to analyze the general computational 
characteristics of physical systems. Such an investigation seems to require a 
complete theory, or set of theories, covering all of physics. Even though many
physicists are pursuing research into finding a "Theory of Everything", it is far
from clear whether such a theory exists, or if it can be found in the near
future. And should such a theory exist, we know nothing of its implications for
computability.

^^The role of the Turing machine model of computation for the development of the modern
digital computer is discussed in [21].






It is not really primarily a question of theory, but rather of phenomena. New 
physical phenomena might very well be discovered in the future that require new 
theoretical concepts for their explication. 

2.3.6 Computability 

A classic result in the theory of computability is that there are non-computable
functions. This follows, almost trivially, once one has accepted the following
three propositions:

• (i) the set of Turing machine programs is enumerable,

• (ii) there are non-denumerably many functions $f : \mathbb{N} \to \mathbb{N}$,

• (iii) the Church-Turing thesis.

The first two propositions have the status of mathematical theorems as they
can be formulated within precisely defined formalisms. The third cannot be
proved as it relates the intuitive notion of an effective method to the formal
models of computation.

The enumerability of the Turing machines follows from the requirement that 
every Turing machine program must be stated as a finite set of instructions, each 
instruction being built from a finite number of tokens. Each program therefore 
is a finite string of tokens from a finite alphabet, and since such strings can be 
enumerated, the set of Turing machine programs is enumerable. For a complete 
proof, an explicit numbering of the programs must be given, but that can be 
done based on Gödel numbering. Also, by explicitly giving an enumeration, a
particular non-computable function can be exhibited. 

The non-enumerability of the functions $f : \mathbb{N} \to \mathbb{N}$ can be proved using a
diagonal argument. The proof, though standard and well-known, will be given
here since it illustrates the diagonal method^^ in a simple setting. First note
that we are considering all such functions, both partial and total. Suppose that
we are given an enumeration of all functions $F = \{f_n\}_{n=0}^{\infty}$. We can then define
a new function $u$, called the anti-diagonal function [23], where

$$u(n) = \begin{cases} 0 & \text{if } f_n(n) \text{ is undefined} \\ f_n(n) + 1 & \text{otherwise.} \end{cases}$$

This is a well-defined total function. Note that questions of computability do
not enter at this stage. If the list $F$ is complete, then the function $u$ must be
one of the functions in the list, say $f_m$, and thus $u(x) = f_m(x)$ for every number
$x$. In particular $u(m) = f_m(m)$. Using the definition of $u$ we get

$$f_m(m) = u(m) = \begin{cases} 0 & \text{if } f_m(m) \text{ is undefined} \\ f_m(m) + 1 & \text{otherwise.} \end{cases}$$

In the first case $f_m(m)$ would be both undefined and equal to 0, in the second
it would equal its own successor.
This contradiction proves that the list $F$ cannot be complete and the set of all
functions cannot be enumerated in any way.

^^The method was invented by G. Cantor.

Computability enters when we ask the question whether the anti-diagonal
function can be computed or not. If the list $F$ was compiled using Turing
machines, i.e. if the list is a list of all Turing computable functions, then the
proof shows that there are Turing non-computable functions. This argument can
be used on any computational model. If $F$ is a list of all functions computable
within a certain named model, then there are functions that are not computable
within this model. A priori, different models of computation could give rise to
different sets of computable functions. It is an empirical fact that this is not
the case.

It turns out that all computational models claiming to capture the idea of
effective computability that have been considered so far can be proved to yield
the same set of computable functions.

The next question is, can the function $u$ be computed effectively at all, using
some intuitive model? If that is the case, then the classic computational models
are too narrow. On the other hand, if the Church-Turing thesis is true, then the
function $u$ is absolutely uncomputable.

It is clear what the key point is. If we consider total functions, i.e. functions 
defined for all numbers, then the diagonal argument shows that any compu- 
tational model that computes total functions is too narrow. In this case, the 
clause taking care of the cases when the function is undefined, is not needed. 
The anti-diagonal function can be defined, and it is easily proved that it can- 
not occur in the list of functions. Therefore the model is incomplete. But in 
this case, the anti-diagonal function is intuitively computable. This is often 
phrased as saying that we can diagonalize out of any computational model for 
total functions. This intuitive computation of the anti-diagonal function relies 
on examining the list of functions and computing its values based on this list, 
so it could also be seen as a meta-computation. 
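For total functions this meta-computation is easy to sketch. Below, a finite Python list stands in for the enumeration $f_0, f_1, \ldots$ (an assumption made so that the example terminates), and `anti_diagonal` is a hypothetical helper that evaluates the n-th listed function at n and adds one.

```python
from typing import Callable, List

def anti_diagonal(fs: List[Callable[[int], int]]) -> Callable[[int], int]:
    """Return u with u(n) = fs[n](n) + 1, defined for every index in the list."""
    def u(n: int) -> int:
        return fs[n](n) + 1
    return u

# A finite stand-in for an enumeration f_0, f_1, f_2, ... of total functions.
listed = [lambda x: 0, lambda x: x, lambda x: x * x, lambda x: 2 * x + 1]
u = anti_diagonal(listed)

# u differs from the n-th listed function at the argument n, so u is not in the list.
differs = [u(n) != f(n) for n, f in enumerate(listed)]
```

Because every listed function is total, each call `fs[n](n)` returns, and no clause for the undefined case is needed.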

The question of whether it is possible to diagonalize out of the model or not 
when partial functions are allowed, depends on whether there is any general 
procedure to determine if a function is defined for a certain number or not. If 
there is such a procedure, it could be used to compute the anti-diagonal function. 

Now, for the Turing machine model, a function is left undefined for a certain 
argument in two cases. Either the machine stops in a non-standard configura- 
tion, a case which can be taken care of by proper programming. Or the machine 
never stops. If there is a general effective method which is capable of determin- 
ing (in a finite amount of time) whether or not Turing computations halt, then 
this method could be used to diagonalize out of the Turing model. This is the 
halting problem. It can, in fact, be shown that the halting problem is unsolvable 
within the Turing model. 

If one could devise a computational model, formal or intuitive, which was
able to solve the Turing halting problem, then that model would in some sense
be stronger than the Turing model. To date, there is no such model.

The halting problem

The general halting problem is the problem of designing an effective method,
intuitive or within the Turing machine model, to determine whether a particular
Turing machine $M_n$ will ever halt when started to compute with input data $m$. If
a certain computation does not halt, this means that the corresponding function 
is undefined. Therefore, the halting problem is closely related to the question 
of computability. 

The algorithm has access only to the Turing machine programs and the 
input data on the tape. This makes sense, because it is of no use just to set 
the machines running and wait to see if they will stop in a standard terminal 
configuration. The machines, 'destined' not to stop, will run forever, and the 
answer cannot be obtained by waiting. 

It can be proved that the halting problem is unsolvable within the Turing 
machine model. The proof is non-trivial and technical, and we will just outline 
it in the next section. 

If the Church-Turing thesis is correct, the general halting problem is
therefore unsolvable.

2.3.7 Universal Turing machines 

The Turing machines considered so far are special purpose machines. Each and 
every machine is constructed to solve a particular algorithmic problem, the pro- 
gram being encoded in the list of instructions. We will now argue that there 
exist universal Turing machines U, which act like general purpose computers. 
They are programmable in the sense that, given a description of a certain Tur- 
ing machine M, and its input x, it mimics the computation of M. Leaving 
open the details for the moment, by a description of M, we mean a symbolic 
representation of the set of instructions for M in the alphabet of the universal 
machine. In order not to clutter the notation, the description of M will also be 
denoted by M since any machine is essentially defined by its set of instructions 
anyway. So, if the result of running M with input x is $M(x)$, i.e. the function
$f_M(x)$, then we write $M(x) = U(M; x)$ to denote that the universal machine
computes the same result when given as input the description M as well as the
'data' x.

As a preliminary step, note that the Turing machines can be enumerated
and collected into an infinite list $\{M_i\}_{i=1}^{\infty}$. The alphabets are fixed and the
programs can be written as strings by concatenating the instructions. Thus, the
enumeration can be performed using a lexicographic ordering starting by first 
ordering all one-state machine programs, then all two-state machine programs 
and then continuing in this way. 

The actual construction of universal machines is quite complicated if it is
to be carried out in full detail. One complication is that the different machines
M could very well have different alphabets $\Gamma$ and $\Sigma$, and consequently, the
universal machine must be able to accommodate a potentially infinite set of
symbols. However, since for any particular machine the set of symbols is finite,
it is possible to map this set of symbols one-to-one onto a standard set, say
$\Gamma = \{0, 1\}$ and $\Sigma = \{0, 1, \#, \sqcup\}$, using some binary coding. This will be our
strategy.

Furthermore, U must be able to accommodate a potentially infinite set of
labels for internal configurations of the simulated machines. This we also
standardize by encoding the configuration labels using the very same alphabet $\Sigma$.
In this way, both the input data and the program for the simulated machines 
are encoded using the same alphabet. This is useful, since it then makes sense 
to provide the program of a Turing machine as input to the universal machine. 

The internal configurations of U itself may be labeled by any suitable set. 

The construction of U is simplified if it is built as a two-tape machine. The
first tape can then be dedicated to storing the program for the machine being
simulated. The second tape of U is used to store the instantaneous descriptions
of the simulated machine $M_i$. The specific set of instructions for U itself, which
in accordance with the Turing machine model is not stored on any tape but
instead is part of its finite state control, can be thought of as an operating
system.

We can now informally describe the workings of the universal machine. Upon
being set in motion, it scans the leftmost symbol on the second tape (this is
the starting configuration of $M_i$), then it scans the next symbol to the right
(the symbol that $M_i$ itself would have scanned). Having done this it knows
both the internal configuration and the scanned symbol of $M_i$. Then it scans the
first tape, looking for a matching instruction. If such an instruction is found,
it is performed on the second tape. Thus the first step in simulating $M_i$ is
performed. Next it scans the second tape looking for a symbol corresponding
to a configuration of $M_i$, then it scans the symbol to the right (which again is
the symbol scanned by $M_i$). Then it scans the first tape again looking for a
matching instruction. Having found it, it is performed. Continuing in this way
it is clear that the workings of $M_i$ are simulated. What remains to be done if the
construction is to be carried out in detail is to code these operations in terms
of the primitives of U.
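The idea of a program supplied as data can be sketched as follows. This is not a universal Turing machine in the formal sense, since Python plays the role of the finite state control of U; the textual instruction format, the step budget and the names `universal`, `FLIP` and `SUCC` are assumptions made for the illustration.

```python
from typing import Optional

BLANK = "_"  # stand-in for the tape-end blank (an assumption of this sketch)

def universal(description: str, x: str, max_steps: int = 10_000) -> Optional[str]:
    """Simulate the machine encoded in `description` on the input x.

    Each instruction is 'state,read,new_state,write,move' with move L or R;
    instructions are separated by ';'. The first state mentioned is the start state.
    """
    rules = [tuple(instr.split(",")) for instr in description.split(";")]
    state = rules[0][0]
    tape = dict(enumerate(x))      # plays the role of the second tape
    head = 0
    for _ in range(max_steps):
        scanned = tape.get(head, BLANK)
        # Scan the 'program tape' for an instruction matching (state, symbol).
        match = next((r for r in rules if r[0] == state and r[1] == scanned), None)
        if match is None:          # no matching instruction: the simulated machine halts
            lo, hi = min(tape, default=0), max(tape, default=0)
            return "".join(tape.get(i, BLANK) for i in range(lo, hi + 1)).strip(BLANK)
        _, _, state, write, move = match
        tape[head] = write
        head += 1 if move == "R" else -1
    return None                    # step budget exhausted; no verdict on halting

# Hypothetical descriptions: a bit-complementing machine and a unary successor.
FLIP = "q1,0,q1,1,R;q1,1,q1,0,R"
SUCC = "q1,1,q1,1,R;q1,_,q2,1,R"
```

The same function `universal` simulates both machines; only the description string changes, which is the sense in which $M(x) = U(M; x)$.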

2.3.8 The halting problem is undecidable 

We are now in a position to state the halting problem and prove that it is
undecidable. In order to use the formalism set up so far we will phrase the
problem in terms of a decision problem.
Let $\mathcal{H}$ be a language defined by

$$\mathcal{H} = \{M; x : M(x) \nsucc \triangleright\}, \qquad (2.13)$$

which is read out as "The language consisting of all strings that encode a
Turing machine M and an input x such that the machine halts on the input".

Theorem

$\mathcal{H}$ is recursively enumerable.

Proof

What is needed is a Turing machine $H$ that accepts the language $\mathcal{H}$. According
to the definition of recursively enumerable languages (2.12),

$$\begin{cases} H(M; x) \succ q_y & \text{if } M; x \in \mathcal{H} \\ H(M; x) \succ \triangleright & \text{if } M; x \notin \mathcal{H} \end{cases} \qquad (2.14)$$

But then $H$ is precisely a universal machine programmed so that it halts in
the accepting configuration $q_y$ whenever the machine M halts on input x.

Theorem 

$\mathcal{H}$ is not recursive.

Proof 

Suppose, contrary to the proposition, that there exists a Turing machine $H$ that
decides $\mathcal{H}$. This means, according to (2.11), that we have

$$\begin{cases} H(M; x) \succ q_y & \text{if } M; x \in \mathcal{H} \\ H(M; x) \succ q_n & \text{if } M; x \notin \mathcal{H} \end{cases} \qquad (2.15)$$

Now since both the input data and the program are encoded using the same
alphabet, it is possible to write programs for the universal machine that take
programs for other Turing machines as input. Consider therefore a new machine
$D$ which is a modification of the machine $H$ and which takes the description M
as input. It is defined by

$$\begin{cases} D(M) \succ \triangleright & \text{if } H(M; M) \succ q_y \\ D(M) \succ q_y & \text{if } H(M; M) \succ q_n \end{cases} \qquad (2.16)$$

Under the assumption that $H$ exists, this is a perfectly well defined
computation. $D$ can be explicitly defined by just adding a small set of instructions to
$H$, directing it to move right forever in the case that $H(M; M)$ accepts,
or directing it to accept in the case that $H(M; M)$ rejects (by assumption,
$H$ eventually does one or the other).

Let us spell out the definition of $D$ explicitly. On given input M, $D$ first
simulates the universal machine $H$ on input $M; M$. Then, in the case where $H$
would have entered the accepting configuration $q_y$, $D$, which is reprogrammed
(as compared to $H$) to 'loop', just continues to move forever right along the
tape. In the case where $H$ would have entered the rejecting configuration $q_n$,
$D$ is reprogrammed to accept.

But now we can ask how $D$ behaves when it is run with a description of itself
as input, i.e. how does $D(D)$ behave? The definition of $D$ immediately gives

$$\begin{cases} D(D) \succ \triangleright & \text{if } H(D; D) \succ q_y \\ D(D) \succ q_y & \text{if } H(D; D) \succ q_n \end{cases} \qquad (2.17)$$

Let us analyze this. Does $D(D)$ halt or not?

Suppose it does not halt, i.e. $D(D) \succ \triangleright$. That case occurs when $H$ accepts
the input $D; D$. But then it follows from the assumption (2.15) about $H$ that
$D; D \in \mathcal{H}$. This, in its turn, implies $D(D) \nsucc \triangleright$, i.e. that $D(D)$ halts,
using the definition (2.13) of the language $\mathcal{H}$.

On the other hand, suppose $D(D)$ does halt. That case occurs when $H$
rejects the input $D; D$. But then it follows from the assumption (2.15)
about $H$ that $D; D \notin \mathcal{H}$. Consequently, $D(D)$ does not halt according to the
definition (2.13) of the language $\mathcal{H}$.

Both ways, we get a contradiction. The conclusion is that the universal
machine $H$ deciding $\mathcal{H}$ does not exist.

Note that this proof hinges on a delicate interplay between the definition
(2.13) of the halting language $\mathcal{H}$, the assumed properties of the universal
machine $H$ purported to decide $\mathcal{H}$ and the derived properties of the 'diagonal'
reprogrammed machine $D$.
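The structure of the argument can be mirrored in Python, with functions standing in for machine descriptions. The names `make_D`, `refute`, `always_yes` and `always_no` are hypothetical, and `halts` is any claimed decider; the sketch only shows that whatever verdict a candidate gives on the pair (D, D), the behaviour of D contradicts it.

```python
from typing import Callable

def make_D(halts: Callable[[Callable, Callable], bool]) -> Callable[[Callable], str]:
    """Build the 'diagonal' program D from a claimed halting decider."""
    def D(M: Callable) -> str:
        if halts(M, M):        # the decider accepts M;M ...
            while True:        # ... so D runs forever
                pass
        return "q_y"           # the decider rejects M;M, so D accepts
    return D

def refute(halts: Callable[[Callable, Callable], bool]) -> str:
    """Explain why the verdict of `halts` on the pair (D, D) must be wrong."""
    D = make_D(halts)
    if halts(D, D):
        return "claims D(D) halts, but then D(D) loops forever"
    return "claims D(D) loops, but then D(D) halts in q_y"

# Two (necessarily wrong) candidate deciders, for illustration only.
always_yes = lambda M, x: True
always_no = lambda M, x: False
```

Note that `refute` never runs `D(D)` itself; it only inspects the verdict, since in the first case running `D(D)` would indeed not terminate.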

2.4 The classical circuit model of computation 

The circuit model of computation is based on the classical logical gates like 
AND, OR and NOT. Since the 70's these are implemented as physical devices 
in the form of TTL or CMOS technology. In a microprocessor there are millions 
of gates, but they are also packaged in components containing a few gates, which 
can be wired together on circuit boards using traditional soldering techniques. 
The abstract logical values {true, false} are represented by voltage levels. In 
this section we will review the circuit model as a theoretical model of
computation, but everything in this model has a physical realization in terms of real
world circuits and wires.

A circuit is made up of wires and gates. The inputs and outputs of the gates 
are bits, either represented by {true, false} or more conveniently by {0, 1}. A 
single gate might have any number of inputs and outputs, though in practice 
the basic building blocks have just a few inputs and outputs. 

A circuit with $k$ inputs and $l$ outputs corresponds to a function
$f : \{0, 1\}^k \to \{0, 1\}^l$.

Figure 2.1: A general gate. (A box labelled $f$, with its inputs on one side and its outputs on the other.)

Bits are carried from one gate to another through wires. By connecting gates
with wires a circuit is built. No loops or feedback are allowed in the circuit as
that generally leads to instabilities. Wires can be split into two or more wires,
thus duplicating the bit they are carrying. All circuits can be built using just
one type of logical gate, often chosen to be the 2-input gate NAND. NAND is
an AND gate followed by a NOT gate. The NAND gate is therefore said to be
universal for classical (non-reversible) computation. Circuits are often easier to
construct and understand if one allows oneself to use a larger set of gates: NOT,
AND, OR, NAND, XOR.
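The universality of NAND for the gates listed above can be checked exhaustively. The wirings below are the standard textbook constructions, written as Python functions on the bits 0 and 1; they are an illustration and are not taken from the figures of this text.

```python
from itertools import product

def NAND(x: int, y: int) -> int:
    return 1 - (x & y)

# Every other gate is wired from NAND alone; reusing a variable plays the
# role of splitting a wire.
def NOT(x: int) -> int:
    return NAND(x, x)

def AND(x: int, y: int) -> int:
    return NOT(NAND(x, y))

def OR(x: int, y: int) -> int:
    return NAND(NOT(x), NOT(y))

def XOR(x: int, y: int) -> int:
    t = NAND(x, y)
    return NAND(NAND(x, t), NAND(y, t))

# Compare against Python's bitwise operators on all input combinations.
ok = all(
    AND(x, y) == (x & y) and OR(x, y) == (x | y) and XOR(x, y) == (x ^ y)
    for x, y in product((0, 1), repeat=2)
) and NOT(0) == 1 and NOT(1) == 0
```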

The basic circuit elements

Figure 2.2: The basic circuit elements.



Input/output relations for the basic circuit elements

The outputs of these gates for different combinations of inputs are given by the
following table.

x  y  |  x AND y  |  x OR y  |  x NAND y  |  x XOR y
0  0  |     0     |    0     |     1      |     0
0  1  |     0     |    1     |     1      |     1
1  0  |     0     |    1     |     1      |     1
1  1  |     1     |    1     |     0      |     0

The NOT gate simply flips 0 to 1 and 1 to 0.

x  |  NOT x
0  |    1
1  |    0

This is sometimes coded in terms of a FANOUT gate, see below.






Apart from the logical gates one also needs the FANOUT gate. It is not 
really a logic gate but just a splitting off of a wire into several wires, all carrying 
the same bit as the original wire. Electrically this is in fact just a splitting 
off of a wire, although in practice, since in real wires there are small currents
flowing, one might get problems with the voltage levels at the outputs if a gate
is drained too heavily. There are therefore limits on the maximum fanout for real
electronic gates.



Figure 2.3: The fanout gate. (A wire carrying $x$ is split into several wires, each carrying $x$.)

Below, we give just one example of a simple circuit, the half-adder, which can
be used as a building block in a circuit for binary addition.




Figure 2.4: A half-adder circuit. (The sum output carries $x + y \bmod 2$.)
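A half-adder can be sketched in Python as follows. The sum output is $x + y \bmod 2$ as in the figure; the carry output, taken here to be x AND y, is the usual convention and is an assumption as far as this text is concerned.

```python
from typing import Dict, Tuple

def half_adder(x: int, y: int) -> Tuple[int, int]:
    """Return (sum, carry): sum is x + y mod 2 (an XOR gate), carry is x AND y."""
    return x ^ y, x & y

# Tabulate the circuit on all four input combinations.
table: Dict[Tuple[int, int], Tuple[int, int]] = {
    (x, y): half_adder(x, y) for x in (0, 1) for y in (0, 1)
}
```

Read as a two-bit binary number, the pair (carry, sum) equals $x + y$, which is what makes the circuit usable as a building block for addition.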

One further concept is the ancilla or auxiliary (work) bit. It is a fixed bit,
set to 0 or 1 once and for all. Physically this is realized by a fixed voltage.

As is well known, there is a complete isomorphism between the circuit model
and Boolean algebra and the functions computed by a circuit are often called
Boolean functions. In fact the easiest way to analyze a circuit is using Boolean
algebra. There is also a very close correspondence with propositional logic,
although in logic the focus is different.

Note that there are no computational steps involved here apart from the time
it takes for the basic circuit elements to compute the outputs from the inputs.
For real physical devices this time is of the order of nanoseconds. Apart from
this time lag, the outputs appear as soon as the inputs are applied.

2.4.1 The circuit model and non-computable functions 

We will now prove that there are circuits to compute any function
$f : \{0, 1\}^k \to \{0, 1\}^l$. The proof uses induction over the number of input bits.

Theorem

Every function $f : \{0, 1\}^k \to \{0, 1\}^l$ has a circuit that computes it.

Proof

First note that it suffices to prove the assertion for functions $f : \{0, 1\}^k \to \{0, 1\}$
as the $l$-bit output case is easily put together from $l$ 1-bit output functions. For
$k = 0$ there is nothing to prove. For $k = 1$ there are four possible functions:

1. The identity. A circuit consisting of a single wire computes this function. 

2. The bit flip. This function is computed by a NOT gate. 

3. The constant function with output 0. This is computed by an AND gate 
with one input bit taken to be an ancilla bit equal to 0. 

4. The constant function with output 1. This is computed by an OR gate 
with one input bit taken to be an ancilla bit equal to 1. 

For the induction step, assume the assertion true for $k = n$. Now let $f$ be
a function of $n + 1$ bits. Define $n$-bit functions $f_0$ and $f_1$

$$f_0(x_1, \ldots, x_n) = f(0, x_1, \ldots, x_n)$$

$$f_1(x_1, \ldots, x_n) = f(1, x_1, \ldots, x_n)$$

These are both $n$-bit functions and are therefore computed by circuits. The
function $f$ is now computed by the circuit implementing the formula

$$f(x_0, x_1, \ldots, x_n) = ((\text{NOT } x_0) \text{ AND } f_0(x_1, \ldots, x_n)) \text{ XOR } (x_0 \text{ AND } f_1(x_1, \ldots, x_n))$$

or using the more convenient Boolean algebra notation

$$f(x_0, x_1, \ldots, x_n) = (\neg x_0 \wedge f_0(x_1, \ldots, x_n)) \oplus (x_0 \wedge f_1(x_1, \ldots, x_n))$$
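The induction step translates directly into a recursive construction. The sketch below is an illustration under stated assumptions: a 'circuit' is represented as a nested Python closure using only NOT, AND and XOR, the recursion bottoms out at $k = 0$ with a constant (ancilla) bit instead of the four one-bit cases listed above, and `build` and `majority` are hypothetical names.

```python
from itertools import product
from typing import Callable

def build(f: Callable[..., int], k: int) -> Callable[..., int]:
    """Return a 'circuit' for the k-bit function f built from NOT, AND and XOR."""
    if k == 0:
        constant = f()                 # a function of no bits is a fixed bit
        return lambda: constant
    f0 = build(lambda *xs: f(0, *xs), k - 1)   # circuit for f(0, x1, ..., xn)
    f1 = build(lambda *xs: f(1, *xs), k - 1)   # circuit for f(1, x1, ..., xn)
    def circuit(x0: int, *xs: int) -> int:
        # Selects f0 when x0 = 0 and f1 when x0 = 1.
        return ((1 - x0) & f0(*xs)) ^ (x0 & f1(*xs))
    return circuit

# Example: 3-bit majority, given only as an opaque Python function.
majority = lambda a, b, c: int(a + b + c >= 2)
C = build(majority, 3)
agrees = all(C(*xs) == majority(*xs) for xs in product((0, 1), repeat=3))
```

The construction evaluates $f$ on all $2^k$ inputs, so the resulting circuit grows exponentially with the number of input bits; the proof is about existence, not efficiency.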

The result now follows by induction.

^^In propositional logic the focus is on the logically true propositions and the notion of
completeness, i.e. the question of whether the logically true, and only the logically true,
propositions can be derived within the system. This question has long since been settled.

So, using gates and wires, any function $f : \{0, 1\}^k \to \{0, 1\}^l$ whatsoever can
be computed. This does not, however, mean that any function $f : \mathbb{N} \to \mathbb{N}$
can be effectively computed using circuits. That would run counter to the
well-established fact that there are non-computable functions. This is an interesting
point that we will examine in some detail. Suppose we want to compute a simple
function like $f(x) = x^2$ for all values of the argument. No single circuit can do
this, as it is immediately clear that the input must be represented in the form of
a bit string, or a binary number, and each bit must be carried by a single wire.
So a 2-bit input circuit can calculate the function for at most the numbers 0, 1,
2, 3. A 3-bit input circuit manages the numbers 0 through 7. So what we really
need in order to compute the square function is an enumerably infinite family of
circuits. Let us make this notion precise.

Consistent circuit families 

A consistent circuit family consists of a denumerably infinite set of circuits $\{C_n\}$
with the properties

1. The circuit $C_n$ has $n$ input bits and a finite number of extra ancilla bits
as well as a finite number of output bits.

2. The output from $C_n$ is denoted by $C_n(x)$ and is defined for all binary
numbers $x$ of at most $n$ bits of length.

3. If $m < n$ and $x$ is at most $m$ bits in length then $C_m(x) = C_n(x)$. This is
the consistency requirement.

What prevents the circuit model from yielding an effective method for
computing every function whatsoever is the fact that there need not exist an effective
method to construct the circuits in the family for every number $n$.

By a uniform circuit family we mean a consistent circuit family for which
there does exist an algorithm, for example running on a Turing machine, which
computes a description of the circuit for every number $n$. In this way, the
uniform circuit model is by definition equivalent to the other models of
computation.
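For the square function a uniform family is easy to exhibit. In the sketch below the 'description' of $C_n$ is simply its truth table, a deliberately crude stand-in for a gate-level description; `describe_circuit` and `evaluate` are hypothetical names, and $n \geq 1$ is assumed.

```python
from typing import Dict

def describe_circuit(n: int) -> Dict[str, str]:
    """Description of C_n for the square function, as a truth table.

    Keys are n-bit inputs and values are 2n-bit outputs, both as bit strings.
    """
    return {format(x, f"0{n}b"): format(x * x, f"0{2 * n}b") for x in range(2 ** n)}

def evaluate(n: int, x: int) -> int:
    """Run C_n on the number x; raises KeyError if x does not fit in n bits."""
    table = describe_circuit(n)
    return int(table[format(x, f"0{n}b")], 2)
```

Since one short algorithm produces the description for every $n$, the family is uniform, and the consistency requirement holds because, for example, `evaluate(2, 3)` and `evaluate(5, 3)` both give 9.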

From this we see a fundamental difference between the circuit model and the
Turing machine model. Once a Turing machine is programmed, it will compute 
the values of the function for every input number for which it halts. A given 
circuit with a given number of input bits only computes the function for a finite 
range of values. Beyond this range of values, a new circuit (in the family) is 
demanded. 

Another way of looking at the fact that non-uniform circuits compute all
functions is that the non-uniform circuit model cannot be finitely described. The
list of circuits is infinite and we have no finite way to generate the list of circuits.
It therefore falls outside the characterization of a finitely defined algorithm. It
is clear that circuits can be built to compute any computable function, but in
this way one gets specialized circuit families for each computational task. In
order to get a universal model of computation, the circuit must be wired to
perform a standard set of instructions on input data, the computational task
itself being supplied as a program. This is the way an ordinary von Neumann
architecture digital computer works in a (fetch instruction, fetch data, execute
instruction, save data) cycle. The circuit must thus be clocked and in this way
computational steps are introduced.

The circuit model is useful to describe quantum computation, but in itself it 
is rather awkward. What makes an algorithm for an infinite set of instances of 
a problem useful is the fact that once the algorithm is known, it permits us to 
obtain new knowledge. If we do not know the value of a computable function 
for a certain argument, run the algorithm to find that value out! Using circuits, 
new members of the circuit family must be computed in order to get new values 
of the function. 

In a sense, this is a reflection of the fact that the circuits really just furnish 
formulas for the function values, not algorithms. If there is an explicit formula 
for a function, no algorithm is needed to compute the values. 



2.4.2 Reversible gates 

The classical logical gates are all irreversible except for the NOT gate. This 
means that the values of the input bits cannot be inferred from the values of the 
output bits. Just one example illustrates the point. If an AND gate outputs 
0, there is no way to know which of the possible input combinations 00, 01 or 
10 resulted in the output. 

In [S] and [10] it was shown that an irreversible logical operation has to
dissipate a certain minimum amount of energy. On the other hand, a reversible
logical operation does not have to dissipate any energy. This led to an interest
in reversible computations, and this was also one of the initial motivations
behind research into quantum computation, since the evolution of closed quantum
systems is inherently reversible.

The first requirement for a gate to be reversible is that the number of output
bits equals the number of input bits. A simple example is the CNOT gate.
It has two input bits and two output bits.



Figure 2.5: The CNOT gate (inputs x, y; outputs x', y').

The function computed by this gate is shown in the 'truth' table below.

    x  y | x'  y'
    0  0 | 0   0
    0  1 | 0   1
    1  0 | 1   1
    1  1 | 1   0

The name CNOT stands for controlled-NOT. The first input bit x controls
the second input bit in such a way that when x = 0, the second output
bit y' = y, and when x = 1, then y' = NOT y. The first output bit x' is always
equal to x. Note that the gate can also be regarded as a generalization of XOR
since y' = x XOR y.

The fact that the CNOT gate is reversible can be seen in two ways. First,
by simply inspecting the truth table, it is clear that knowing x' and y', x and
y can be uniquely derived. Secondly, if a second CNOT is connected after the
first CNOT, the total effect will be the same as that of just two unit wires,
i.e. CNOT is its own inverse.

The functional relations between inputs and outputs can thus be written 

$$\begin{cases} x' = x \\ y' = x \oplus y. \end{cases}$$

The CNOT gate can be used to make a copy of an input bit. If y in the 
truth table is fixed to 0, both x' and y' are equal to x. 
This can also be seen as a FANOUT.
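
The three properties just discussed (reversibility, self-inverse, copying) can
be checked mechanically. The following is a minimal sketch; the function name
`cnot` and the representation of bits as Python integers are assumptions of the
example.

```python
from itertools import product

def cnot(x, y):
    """CNOT: the control x passes through, the target becomes x XOR y."""
    return x, x ^ y

inputs = list(product([0, 1], repeat=2))

# Reversible: the four inputs map to four distinct outputs.
assert len({cnot(x, y) for x, y in inputs}) == 4

# Self-inverse: two CNOTs in a row act as two unit wires.
assert all(cnot(*cnot(x, y)) == (x, y) for x, y in inputs)

# Copying: with the target fixed to 0, both outputs equal x.
assert all(cnot(x, 0) == (x, x) for x in (0, 1))
```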

Another reversible gate is the Toffoli gate. It has three input wires and three 
output wires. 

Figure 2.6: The Toffoli gate (controls x, y; target z).

The function computed by this gate is shown in the 'truth' table below.

    x  y  z | x'  y'  z'
    0  0  0 | 0   0   0
    0  0  1 | 0   0   1
    0  1  0 | 0   1   0
    0  1  1 | 0   1   1
    1  0  0 | 1   0   0
    1  0  1 | 1   0   1
    1  1  0 | 1   1   1
    1  1  1 | 1   1   0

The functional relations between inputs and outputs can be written

$$x' = x, \qquad y' = y, \qquad z' = (x \wedge y) \oplus z.$$

The first two bits, x and y, can be regarded as control bits; they are not
changed by the gate. Instead, the AND of x and y determines whether the third
bit z is flipped or not. The third bit can therefore be regarded as a target bit.
This terminology is used in quantum computation.

The reversibility of the Toffoli gate can be seen in exactly the same way as 
for the CNOT gate. 

The Toffoli gate turns out to be universal for reversible computation. This
is easily seen, as it can be wired so as to mimic a NAND gate. Fixing the z-input
wire to be 1, we get $z' = (x \wedge y) \oplus 1 = \neg(x \wedge y)$, i.e. x NAND y.
This is seen by restricting the truth table to the rows where z = 1.
It can also be wired to mimic a two-wire FANOUT. Fixing the first input
to 1 and the third to 0, the bit on the second input appears on the second and
third output. This is seen by restricting the truth table to the rows where x = 1
and z = 0. Clearly y' = y and z' = y.
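
Both wirings can be verified by exhaustive enumeration. This is a minimal
sketch; the helper names `toffoli`, `nand` and `fanout` are assumptions of the
example.

```python
from itertools import product

def toffoli(x, y, z):
    """Toffoli: flip the target z exactly when both controls are 1."""
    return x, y, (x & y) ^ z

def nand(x, y):
    """NAND from a Toffoli gate with the third input fixed to 1."""
    return toffoli(x, y, 1)[2]

def fanout(y):
    """FANOUT from a Toffoli gate with x fixed to 1 and z fixed to 0."""
    _, y_out, z_out = toffoli(1, y, 0)
    return y_out, z_out

triples = list(product([0, 1], repeat=3))
assert all(toffoli(*toffoli(*t)) == t for t in triples)   # self-inverse
assert [nand(x, y) for x, y in product([0, 1], repeat=2)] == [1, 1, 1, 0]
assert [fanout(b) for b in (0, 1)] == [(0, 0), (1, 1)]
```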

The bits that are fixed to constant values in these constructions are called
ancilla bits.

We should also mention the Fredkin gate in this context. It is a universal
reversible gate with three inputs and three outputs. It has one control bit x and
two target bits y and z.* If the control bit is 0, the target bits go through
unchanged, whereas if the control bit is 1, the target bits are swapped, i.e.
y' = z and z' = y.
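
The controlled swap can be sketched in the same way. The name `fredkin` is an
assumption of the example, and the two extra checks (conservation of the number
of 1's, and AND obtained by fixing z to 0) are standard properties of the gate
that the text above does not state.

```python
from itertools import product

def fredkin(x, y, z):
    """Fredkin (controlled swap): swap y and z exactly when x is 1."""
    return (x, z, y) if x else (x, y, z)

triples = list(product([0, 1], repeat=3))

# Self-inverse, hence reversible.
assert all(fredkin(*fredkin(*t)) == t for t in triples)

# The number of 1's is the same before and after the gate.
assert all(sum(fredkin(*t)) == sum(t) for t in triples)

# With z fixed to 0, the third output is x AND y.
assert all(fredkin(x, y, 0)[2] == (x & y) for x, y in product([0, 1], repeat=2))
```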

The Toffoli gate is more useful in quantum computation. 

2.4.3 Reversible circuits and un-computation 

Relying on the universality of Toffoli gates, a circuit wired with NAND and
FANOUT gates can be rewired into a reversible circuit. In order to do that,
extra "ancilla" bits are needed. Furthermore, the Toffoli gates output one or
two extra bits (depending on whether they mimic FANOUT or NAND) not
needed in the computation. These bits only serve the purpose of making the
computation reversible. The extra output bits from each Toffoli gate add up
to what essentially amounts to "garbage". It would be nice to be able to have
the ancilla bits in a standard state and to get rid of the garbage bits. Simply
erasing them will not do, as that would spoil the reversibility. However, there
is a procedure to clean up the garbage using precisely this reversibility!

Suppose we have a non-reversible circuit computing a function f on some
n-bit input x. We want to do this computation reversibly while cleaning up the
garbage. If the non-reversible computation is represented as

$$x \to f(x), \tag{2.18}$$

we can represent the reversible computation as

$$(x, a) \to (f(x), g(x)), \tag{2.19}$$

where a denotes the ancilla bits needed to wire the Toffoli gates and g(x)
denotes the resulting garbage bits. The ancilla bits, as well as x, can be thought
of as bit strings stored in appropriately sized registers.

In order to put the ancilla bits in a standard state, we allow the use of NOT
gates. These are reversible. Then all ancilla bits can be 0's, using NOT gates
where 1's are needed. So now we have $(x, 0) \to (f(x), g(x))$, with 0 denoting a
bit string with just 0's.

*To be consistent with terminology, we ought to speak about input and output wires instead
of input and output bits. However, this is a common abuse of language.
Furthermore, allowing the use of CNOT gates (also reversible), we can
do two things. First, a copy of the input bit string x can be made, so that the
computation now reads

$$(x, 0, 0) \to (x, x, 0) \to (x, f(x), g(x)), \tag{2.20}$$

where the first arrow corresponds to the copying action of the initial CNOT
gates.

Now we can introduce the idea of uncomputation. This is a procedure that
allows us to get rid of the garbage bits by inverting the circuit and, so to speak,
uncomputing the garbage back to 0. But of course, the result of the computation
f(x) must be saved first. This can be done by introducing a fourth input
register y with the same size as the register needed to store the result f(x). The
register y is not used until the computation of f(x) is finished. Then CNOT
gates are used to add the result bitwise (modulo 2) to the bits in y. Now the
computation reads (suppressing the initial x-copying),

$$(x, 0, 0, y) \to (x, f(x), g(x), y) \to (x, f(x), g(x), y \oplus f(x)). \tag{2.21}$$

All steps performed in this computation up to the last ones involving y are
reversible and none of them affects the fourth register, so reversing this part of
the computation yields

$$(x, f(x), g(x), y \oplus f(x)) \to (x, 0, 0, y \oplus f(x)). \tag{2.22}$$

The complete computation now reads

$$(x, 0, 0, y) \to (x, 0, 0, y \oplus f(x)). \tag{2.23}$$

Note that this is still a reversible computation, as the $\oplus$ part could also be
reversed, giving the bit string y back. But, of course, we don't want to do that.
So, suppressing the ancilla bit strings, we have simply

$$(x, y) \to (x, y \oplus f(x)). \tag{2.24}$$
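
The compute, copy, uncompute pattern can be traced on a small case. The sketch
below assumes the hypothetical function f(x) = x0 AND x1 AND x2, built from two
Toffoli gates with one garbage bit; the wire layout and all names are
assumptions of the example. The initial copying of x is omitted, since these
gates never overwrite their control wires.

```python
from itertools import product

def toffoli(bits, a, b, t):
    """Flip bits[t] in place when bits[a] and bits[b] are both 1."""
    bits[t] ^= bits[a] & bits[b]

def cnot(bits, c, t):
    """Flip bits[t] in place when bits[c] is 1."""
    bits[t] ^= bits[c]

# Wire layout: x0, x1, x2 | result r | garbage g | output register y
X0, X1, X2, R, G, Y = range(6)

# Forward circuit: computes f(x) into r and leaves garbage in g.
FORWARD = [(toffoli, (X0, X1, G)),   # g = x0 AND x1
           (toffoli, (G, X2, R))]    # r = g AND x2 = f(x)

def compute_with_cleanup(x, y):
    bits = list(x) + [0, 0, y]
    for gate, wires in FORWARD:            # (x, 0, 0, y) -> (x, f, g, y)
        gate(bits, *wires)
    cnot(bits, R, Y)                       # y -> y XOR f(x)
    for gate, wires in reversed(FORWARD):  # uncompute: (x, f, g, .) -> (x, 0, 0, .)
        gate(bits, *wires)
    return bits

for x in product([0, 1], repeat=3):
    for y in (0, 1):
        f = x[0] & x[1] & x[2]
        assert compute_with_cleanup(x, y) == list(x) + [0, 0, y ^ f]
```

Because each gate is its own inverse, running the forward gate list in reverse
order is enough to undo it.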

2.4.4 Reversible computation and physics 

Up to this point we have only discussed "logical" reversibility. The computation
is reversible in the sense that the input can be recovered from the output by
carefully keeping track of every bit during the computation. This is very close to
"physical" reversibility, by which is meant that the time evolution of a physical
system can be reversed so that an initial state of the system can be recovered
from a final state. When a computational process is carried out by a physical
system,* there is precisely such a time evolution involved, so the two concepts
of reversibility must be closely connected.

*There must always be some underlying physical system performing the computation, even
when computation is viewed abstractly as mere symbol shuffling. Someone or something must
shuffle the symbols.
The microscopic laws of dynamics are all reversible, whether classical or
quantum. Reversible dynamics does not dissipate any energy. In order for this 
to make sense one must really talk about closed physical systems, i.e. systems 
that do not in any way interact with the environment. One must also have full 
control over all degrees of freedom, i.e. the dynamics of every degree of freedom 
must be governed by fundamental (time-reversible) equations of motion. Such 
systems are conservative, meaning that the time evolution is reversible and no 
energy is dissipated into the environment. 

Comparing this to a computational process we can guess that it is the loss of 
control of some of the individual bits, inadvertently or deliberately, that leads 
to energy dissipation in a computational process. 

Indeed, as has been studied by Landauer [??], erasure of one bit of information
leads to an energy dissipation of at least $k_B T \ln 2$. Here $k_B$ is Boltzmann's
constant, a fundamental constant of physics relating mechanical quantities to
thermodynamical quantities like energy and entropy. T is the temperature of
the environment into which the energy is dissipated.
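
To get a feeling for the size of this quantity, it can be evaluated
numerically. The sketch below assumes a room temperature of 300 K, which is an
illustrative choice and not a value taken from the text; the function name
`landauer_limit` is likewise an assumption.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K (exact in the 2019 SI)

def landauer_limit(temperature_kelvin):
    """Minimum energy in joules dissipated when one bit is erased."""
    return K_B * temperature_kelvin * math.log(2)

energy = landauer_limit(300.0)   # 300 K is an assumed room temperature
print(f"{energy:.3e} J per erased bit")
```

At this temperature the bound is roughly 3 x 10^-21 joules per erased bit.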

Physically, this possibility of performing computations reversibly is of more
theoretical than practical interest. The solid state hardware of today dissipates
energy far above the $k_B T \ln 2$ limit. Even if solid state circuits can be
manufactured that perform reversible logical operations like the Toffoli gate,
these devices must be powered by some voltage source like any other electronic
gate. Tiny electric currents will flow and there will be heat dissipation from
electric resistance. Even if this effect can be minimized, perhaps by exploiting
superconductivity, the inevitable weak interaction with the environment will
generate noise that will have to be corrected. The error correction registers
employed must eventually be erased, since memory is always finite, leading to
energy dissipation.

For now, classical reversible computation serves just as a backdrop to quan- 
tum computation, which is inherently reversible. We will return to the subject of 
reversible computation in chapter 8 on physics of computation, and in particular 
to the question of the thermodynamics of computation. 

2.5 Comparison to real computers 

Neither the Turing machine model nor the circuit model is very close to the 
actual workings of a modern digital computer. In what sense then are they 
models of real world computers? It is generally agreed that all present day 
general purpose digital computers are von Neumann machines, i.e. machines that
store both data and program in a memory and which work in a cycle of
(fetching instructions and data, executing instructions, storing data). This is a
rather vague description of the basic workings of a computer and cannot by itself 
serve as a model of computation. There is however a computational model that 
very closely captures the workings of a modern computer - the Random Access 
Machine-model. It has a CPU with temporary storage registers, a program 
counter and an ALU - much as in a real world processor. The CPU is connected 
to a (random access) memory which stores both data and program. In order
to read and write in arbitrary memory locations, every memory cell has an
address. The model can be programmed in an assembly-like language. It is
clear from the close analogy to real computers that the model has expressive 
strength enough to support compilers and higher level languages. 

The main difference between the Turing machine and real computers is that
the memory of a Turing machine is not immediately accessible. In order to read
a square away from the present position of the read/write head, all intermediate
squares must be traversed and read.

The Random Access Machine (RAM) can reach an arbitrary memory cell in 
a single step. It can be considered a simplified model of real world computers. 

The same functions are computable on Turing machines and on the RAM. 

2.6 Non-deterministic Turing Machines 

The models of computation considered so far are deterministic, i.e. at each step
in the computation, the next step is exactly determined by the program and the
data. However, non-deterministic models of computation are theoretically very 
important and often lead to simplification of analysis, though they cannot in 
general be efficiently implemented. Referring back to the definition of a Turing 
machine, we see that in the set of instructions defining the program, there is at 
most one instruction with a certain combination of scanned tape symbol and 
machine state. This makes the computation deterministic, i.e. the action of 
the machine is uniquely determined. Removing this restriction leads to non- 
determinism. 

Informally, for any combination of scanned tape symbol and machine con- 
figuration, we allow a set of possible instructions. 

Formally this is easiest to formulate in terms of the transition function.
Remember the transition function

$$\delta : Q \times \Gamma \to (Q \cup Q_h) \times \Gamma \times M,$$

which maps combinations of scanned symbol and machine configuration into
the set of combinations of configurations, symbols and moves.

Consider the set of all subsets of the set $(Q \cup Q_h) \times \Gamma \times M$; this is the power
set $\mathcal{P}((Q \cup Q_h) \times \Gamma \times M)$. Allowing several different instructions having the
same tape symbol and configuration can be formulated in terms of a transition
function that maps into this set of subsets,

$$\delta : Q \times \Gamma \to \mathcal{P}((Q \cup Q_h) \times \Gamma \times M).$$

How does a non-deterministic Turing machine compute? Suppose that dur- 
ing the computation the machine finds a matching pair of scanned symbol and 
machine state for which there are several instructions. The computation then 
branches off into parallel computations, one for each possible way to proceed. 
The state graph for a non-deterministic computation is therefore a directed tree,
whereas for a deterministic computation it is a list.

In order to actually carry out such a computation in parallel, one would have 
to assign new computational resources at each branch in the graph, in the form 
of new processors or new Turing machines. In practice, this is not possible in 
the general case, where the maximal number of processors is limited.

The alternative would be to traverse the tree breadth-first,* using just one
processor. Exponential resources are needed in the generic case, in the form
of increasing time and space requirements.

It is quite easy to argue that the set of computable functions is the same for
deterministic and non-deterministic Turing machines.
Suppose a partial function is computable by a non-deterministic Turing machine.
This means that the function values are found at the ends of terminating
branches of the computation graph. Performing the computation breadth-first
on a deterministic Turing machine, we are guaranteed to eventually reach the
halting states, possibly after consuming an exponential amount of time and
space. The computation might take exponential time to systematically work
through the ever increasing number of branches, and use an exponential amount
of tape to record information about the state at not yet processed branch points.
Tape can be reclaimed but not time. Still, the function is computable on a
deterministic Turing machine.
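
The breadth-first argument can be sketched as an ordinary deterministic
program. In the sketch below the transition function maps into sets of
instructions, as in the power-set formulation above. The encoding, the toy
machine (which guesses where two consecutive 1's start), and the names
`bfs_accepts` and `max_configs` are assumptions made for illustration; it is a
decision version of the argument, not a full Turing machine simulator.

```python
from collections import deque

BLANK = "_"

def bfs_accepts(delta, start, accept, tape, max_configs=10_000):
    """Search the computation tree breadth-first for an accepting branch.

    delta maps (state, symbol) to a set of (new_state, written_symbol, move)
    triples with move in {-1, +1}; a missing entry means the branch halts.
    """
    queue = deque([(start, 0, tuple(tape) or (BLANK,))])
    for _ in range(max_configs):
        if not queue:
            return False                       # every branch halted, none accepted
        state, head, cells = queue.popleft()
        if state == accept:
            return True
        for new_state, symbol, move in delta.get((state, cells[head]), ()):
            new_cells = list(cells)
            new_cells[head] = symbol
            new_head = head + move
            if new_head < 0:                   # extend the tape to the left
                new_cells.insert(0, BLANK)
                new_head = 0
            elif new_head == len(new_cells):   # extend the tape to the right
                new_cells.append(BLANK)
            queue.append((new_state, new_head, tuple(new_cells)))
    raise RuntimeError("configuration budget exhausted")

# Hypothetical machine: on reading a 1 it either keeps scanning or
# guesses that this 1 is the first of two consecutive 1's.
delta = {
    ("scan", "0"): {("scan", "0", +1)},
    ("scan", "1"): {("scan", "1", +1), ("seen1", "1", +1)},
    ("seen1", "1"): {("yes", "1", +1)},
}

assert bfs_accepts(delta, "scan", "yes", "0110")
assert not bfs_accepts(delta, "scan", "yes", "0101")
```

The queue holds all not yet processed branch points, which is where the
potentially exponential space requirement shows up.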

2.6.1 A note on classical parallelism

Non-determinism offers a kind of parallelism. Parallel computation and
non-deterministic computation are overlapping concepts but they are not the same.
Parallel computation does not involve an unbounded number of parallel processes,
as there is always a maximum number of processors available in any real machine.
On the other hand, parallel processes can communicate, by shared data
or by passing data (messages), and that need not be the case for non-deterministic
algorithms. Parallel algorithms and parallel computation are a huge subject and
there are several different models for parallel computation but no generally
agreed on paradigm.

One might wonder if classical parallelism is a threat to the Church-Turing
thesis. That is, is it possible to compute non-computable functions using parallel
computation? The answer is no, and the argument is similar to the argument
in the case of non-determinism.

2.7 Probabilistic Turing machines 

There is a close connection between probabilistic Turing machines and
non-deterministic Turing machines. In non-deterministic machines, the computation
can branch off into different sub-computations, in principle at every node in
the computation. Consequently, the computation graph becomes a tree. Now, if

*Depth-first traversal runs the risk of going down a non-terminating branch, so
breadth-first is the best option in an actual simulation.
the computational graph edges leading out from a node allowing branching are
assigned probabilities, and a probabilistic choice as to which edge to follow is 
made, we get a probabilistic Turing machine. In this case, the computation be- 
comes a directed path through the computation graph. Of course, different runs 
of the same machine, with the same input, will give different paths depending 
on the random choices made at each branch node. 

Given a perfect random number generator rnd, a probabilistic machine can 
be easily simulated on a deterministic machine by performing calls to the rnd 
at each step allowing for probabilistic choices. 

Formalizing this concept will yield a first step towards an understanding
of quantum Turing machines as well as providing a background for discussing
where quantum computation departs from classical computation. Quantum
Turing machines are treated in chapter 5.

The point of departure is the transition function $\Delta$ of section 2.3.2. There,
the values 0 and 1 of the function determined whether the transition was present
in the program or not. Thinking of the numbers 0 and 1 as probabilities, one is
led to extend the range of $\Delta$ to numbers p in the interval [0, 1], interpreting
p as the probability for the transition, i.e. defining

$$\Delta : Q \times \Gamma \times (Q \cup Q_h) \times \Gamma \times M \to [0, 1].$$

Referring back to the computational tree of a non-deterministic machine, we 
can turn this into a computation tree for a probabilistic machine by marking up 
each branch node with probabilities. The probability to reach a certain node in 
the tree is the total calculated probability to reach that node from the initial 
starting node of the computation. 

For theoretical purposes we could then consider the probabilistic machine
as being, at each computational step, in a (classical) superposition of all the
reachable states (from the start). Denoting the states CqiTZ by |s),* we can
formally write this superposition as a sum $\sum_s p_s\,|s)$ where the summation runs
over all states reachable at the given stage in the computation.

This superposition is an entirely theoretical construct. Physically there is no 
such superposition for a classical probabilistic Turing machine. Each separate 
execution of the machine simply traces out a path in the computation graph. 
Theoretically, however, we can speak of the machine as being in a superposed 
state. Observing (or measuring) the machine after a certain number of time 
steps, we will find it in a certain state with a certain probability. This
probability is the same as the probability to reach that state from the initial 
starting state. The classical computation of a probabilistic machine can be 
observed at each state, thus tracing out the particular execution path. This 
observation, or measurement is of no consequence for the future execution of 
the machine. 
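
The classical "superposition" is just a probability distribution over states,
and it can be propagated step by step. The following is a minimal sketch; the
three-state machine, the dictionary encoding and the name `step` are
hypothetical, and a state without outgoing transitions is treated as halted.

```python
from collections import defaultdict
from fractions import Fraction

def step(distribution, transitions):
    """Advance a probability distribution over states by one time step.

    transitions[s] is a list of (probability, next_state) pairs summing to 1;
    a state without an entry is treated as halted and keeps its weight.
    """
    new = defaultdict(Fraction)
    for state, p in distribution.items():
        for q, nxt in transitions.get(state, [(Fraction(1), state)]):
            new[nxt] += p * q
    return dict(new)

half = Fraction(1, 2)
transitions = {                       # a hypothetical three-state machine
    "start": [(half, "a"), (half, "b")],
    "a": [(half, "start"), (half, "halt")],
    "b": [(Fraction(1), "halt")],
}

dist = {"start": Fraction(1)}
for _ in range(2):
    dist = step(dist, transitions)

# After two steps: weight 1/4 on "start" and 3/4 on "halt".
assert dist == {"start": Fraction(1, 4), "halt": Fraction(3, 4)}
```

A single run of the machine would be found in exactly one of these states; the
weights are the probabilities of finding it there.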

Here we have two major differences as compared to quantum machines. 
Firstly, for quantum machines, the corresponding superpositions (somewhat
differently defined though) do actually exist. Secondly, and as a consequence of the

*The notation $|s\rangle$ will subsequently be used for true quantum states.
reality of the superposition and the nature of quantum dynamics, measuring
a quantum Turing machine will affect the future execution of the machine.

Returning to the classical probabilistic machine, it intuitively appears that 
restricting the branching probabilities to {0, 1/2, 1}, 1/2 corresponding to fair 
coin tossing, ought to be sufficient. This can in fact be proved [??]. 

2.8 Some Complexity Theory 

Computability theory discusses what can be computed in principle by investi- 
gating the boundary between computable and uncomputable functions. Com- 
plexity theory discusses what can be computed in practice by analyzing the 
amount of resources needed in a computation. 

The complexity is calculated by analyzing the algorithm, not by running 
the computation, so clearly, in order for complexity to make sense, it must be 
defined for decidable problems or computable functions. 

Complexity is most conveniently discussed in terms of the computational 
resources required to decide recursive languages, i.e. in terms of decision prob- 
lems. A computational resource can be time, roughly measured as the number 
of steps required by an algorithm. Another resource is space, corresponding to 
the amount of memory required. It could also be some other physical resource
like energy, but time and space are the most important measures from
the point of view of difficulty of algorithmic problems. In the circuit model,
complexity is naturally measured in terms of the number of gates in the circuit. 

A distinction is made between tractable problems and intractable problems.
A problem is tractable if it can be solved on a computer using a reasonable
amount of CPU-time and/or memory. It is well known that there is a dramatic
difference in the growth rate of polynomial functions and exponential functions.
A reasonable amount of resources is one that grows at most as a polynomial in
the size of the problem.
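
The difference in growth rate is easy to tabulate. In the sketch below the
exponents and the search limit are arbitrary illustrative choices, and the name
`last_polynomial_lead` is an assumption.

```python
def last_polynomial_lead(k, search_limit=1000):
    """Largest n below search_limit with n**k >= 2**n, or None if there is none."""
    leads = [n for n in range(1, search_limit) if n**k >= 2**n]
    return max(leads) if leads else None

for n in (10, 20, 40, 80):
    print(f"n = {n:2d}   n**3 = {n**3:6d}   2**n = {2**n}")

# From n = 10 onwards (within the search range), 2**n stays ahead of n**3.
assert last_polynomial_lead(3) == 9
```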

Algorithms are not in general intended to solve particular problems, but 
rather sets of problems, parameterized in some way. A particular problem in 
the set is called an instance. In general the instances are increasing in size 
in terms of the parameters. When analyzing a certain algorithm for a certain 
problem (for example, insertion sorting for sorting) it is in general the worst- 
case behavior that is interesting. In that case we are interested in upper bounds 
on the amount of resources required by the algorithm. 
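
For the insertion sort example, the worst case can be found by brute force for
small instance sizes. The sketch below counts comparisons as the assumed
measure of a "step"; the function names are hypothetical, and the exhaustive
search over all permutations is only feasible for small n.

```python
from itertools import permutations

def insertion_sort_comparisons(items):
    """Sort a copy of items by insertion and return the number of comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1
            if a[j - 1] <= a[j]:
                break                      # element is in place
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons

def worst_case(n):
    """Maximum comparison count over all instances of size n."""
    return max(insertion_sort_comparisons(p) for p in permutations(range(n)))

# The worst case is the reversed input, with n(n-1)/2 comparisons.
assert all(worst_case(n) == n * (n - 1) // 2 for n in range(1, 7))
```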

When analyzing classes of algorithms for a certain problem (for example, 
the class of sorting algorithms) it is rather lower bounds that are in focus. We 
want to know the performance of the best possible algorithm. 

We will now make these notions exact. First of all we need a model of
computation. In general, different models of computation can have different
strengths.* However, the so called slowdown between different reasonable
models is polynomial, and therefore not important theoretically. Of course, in
practical computing, even small increases in speed can be important.

*This is in contrast to the situation as regards computability.
In order to treat the complexity of algorithmic problems in a uniform way,
the problems are formulated in terms of formal language theory. Problems are
coded using some alphabet $\Sigma$, and problem instances then correspond to strings in
the set of all strings $\Sigma^*$. A decision problem then amounts to deciding whether
a given string belongs to the language (which is a subset of $\Sigma^*$, see section X.X.X)
or not.

This section on the theory of complexity will be very brief and just record the 
basic definitions and results of the topic without proofs or detailed explanations. 
A good modern reference is [S]; see also [??] and [PU].

2.8.1 Measures of complexity 

Any 'reasonable' model of computation can be used to set up the theory of com- 
plexity. What is needed for measuring the time complexity is some consistent 
way of counting computational steps in terms of a unit of time for performing 
some elementary step. There is a good deal of arbitrariness here both regarding 
what is a step and what is an elementary unit. This arbitrariness is however
inherent to the problem, and in the end does not matter much, as good measures
of complexity differ only polynomially.

It is anyway not the exact number of steps that is important. Rather, we 
need a concept of complexity that is robust to incremental improvements in 
hardware and software, and yet sensitive to more dramatic developments, of 
which quantum computation is an example.*

The Turing machine model will be used here. Time complexity will be 
defined in terms of the number of steps taken by the machine during the com- 
putation. Space complexity will be defined in terms of the maximal number of 
tape squares needed by the computation. These complexity measures will be 
functions from the size of the input to the number of computational steps and 
the number of tape squares respectively. Input size is defined as the number of 
written tape squares. 

Let L be a recursive language decided by a Turing machine M . Referring 
back to section 2.3.4 we recall that this means that M halts on all inputs x 
in either the 'yes' or the 'no' configuration depending on whether the string x 
belongs to the language or not. Computational resources are measured in terms
of the size of the input x, which is taken to be the number of non-blank tape
squares in the start configuration of the machine. This is called the length of
the input, i.e. it is just the length of the string written on the tape in the start
configuration.

This is a simplification, since the number of steps required by an algorithm 
may depend on several of the parameters defining the instance of the problem. 
For example, in a graph problem, both the number of nodes and the number 
of edges may have an effect on the running time of an algorithm. On the other 
hand, the maximum number of edges in a graph with n nodes is $n(n-1)/2$,

*The extent to which quantum computation is stronger than classical is not yet fully
understood.
so using the square of the number of nodes we get a reasonable measure of the
instance size. The idea is here that the instances are coded in some way on 
the machine tape, and coding both node data and edge data, we take this into 
account when we measure the instance size as the length of the string written 
on the tape. 

Asymptotic upper bounds 

The "big O" notation O is used to set upper bounds on the asymptotic behavior
of functions. We want to capture the notion that the function f is asymptotically
bounded by the function g.

Let f and g be two functions from the natural numbers to the positive real
numbers. f(n) is in the class of functions O(g(n)), or simply f(n) = O(g(n)), if
there exist positive integers c and $n_0$ such that $f(n) \le c\,g(n)$ for every integer
$n \ge n_0$. This simply says that for sufficiently large n, the function f is bounded
from above by the function g apart from a constant factor.

Asymptotic lower bounds 

For lower bounds, the "big Omega" notation $\Omega$ is used.

Again let f and g be two functions from the natural numbers to the positive
real numbers. f(n) is in the class of functions $\Omega(g(n))$, or simply $f(n) = \Omega(g(n))$,
if there exist a positive real constant c and a positive integer $n_0$ such that
$c\,g(n) \le f(n)$ for every integer $n \ge n_0$. This simply says that for sufficiently
large n, the function f is bounded from below by the function g apart from a
constant factor.

Asymptotic behavior 

If a function f is in both O(g) and $\Omega(g)$, i.e. if it, apart from constant factors,
is bounded both from above and below by the same function g, then it behaves
asymptotically as g. The "big $\Theta$" notation is used to indicate this.
Thus, f(n) is in $\Theta(g(n))$ if it is in both O(g(n)) and $\Omega(g(n))$.
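
The role of the constants c and $n_0$ can be illustrated numerically. The
sketch below uses the hypothetical functions f(n) = 3n^2 + 5n and g(n) = n^2;
checking a finite range gives evidence for a choice of constants, not a proof,
and the helper names are assumptions.

```python
def bounded_above(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c*g(n) for n0 <= n <= n_max (numerical evidence only)."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

def bounded_below(f, g, c, n0, n_max=10_000):
    """Check c*g(n) <= f(n) for n0 <= n <= n_max (numerical evidence only)."""
    return all(c * g(n) <= f(n) for n in range(n0, n_max + 1))

def f(n):
    return 3 * n**2 + 5 * n

def g(n):
    return n**2

assert bounded_above(f, g, c=4, n0=5)       # 5n <= n^2 once n >= 5
assert not bounded_above(f, g, c=4, n0=4)   # fails at n = 4: 68 > 64
assert bounded_below(f, g, c=3, n0=1)       # 3n^2 <= f(n) for all n >= 1
```

With c = 4, $n_0$ = 5 for the upper bound and c = 3, $n_0$ = 1 for the lower
bound, this f behaves asymptotically as n^2.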

Time complexity 

The time complexity of a deterministic Turing machine M is a function
$f : \mathbb{N} \to \mathbb{N}$, where f(n) is the maximum number of steps performed by M during
any computation with input length n.

This is also phrased in any of the following ways: f is the running time of
M, M runs in time f, M is a time f machine.

The time complexity class TIME(f(n)) is defined as

$$\mathrm{TIME}(f(n)) = \{\, L \mid L \text{ is decided by an } O(f(n)) \text{ time machine} \,\}. \tag{2.25}$$

There is a corresponding notion of time complexity for non-deterministic
computations. The time complexity of a non-deterministic Turing machine M
is a function $f : \mathbb{N} \to \mathbb{N}$, where f(n) is the maximum number of steps performed
by M on any branch of the computation with input length n.

Space complexity 

The space complexity of a deterministic Turing machine M is a function
$f : \mathbb{N} \to \mathbb{N}$, where f(n) is the maximum number of tape cells scanned by M
during any computation with input length n.

This is also phrased in any of the following ways: M runs in space f, M is
a space f machine.

The space complexity class SPACE(f(n)) is defined as

$$\mathrm{SPACE}(f(n)) = \{\, L \mid L \text{ is decided by an } O(f(n)) \text{ space machine} \,\}. \tag{2.26}$$

Analysis of algorithms

Algorithms are analyzed by roughly estimating the number of steps required to
perform the parts of the algorithm. Textbooks on complexity theory normally
go through the techniques of doing this. The details differ from model to
model depending on the programming primitives available. We will bypass this
topic here, and just rely on our intuition in the simple models we are concerned
with, Turing machines and circuits.

2.8.2 Complexity classes 

For easy reference, we will briefly review the definitions of the basic complexity 
classes. 

The class P 

The time complexity class P is the collection of all languages that are in
$\mathrm{TIME}(n^k)$ for some constant k. That is, a language is in P if it can be decided
by a deterministic Turing machine whose running time is bounded from above
by a polynomial in the length of the input.

The class NP 

NP is an extremely important time complexity class. Its name is an abbreviation
of Non-deterministic Polynomial. It is defined as the collection of all
languages that are in $\mathrm{NTIME}(n^k)$ for some constant k. That is, a language is
in NP if it can be decided by a non-deterministic Turing machine whose running
time is bounded from above by a polynomial in the length of the input. This
class is potentially larger than P; indeed $P \subseteq NP$.

There is another characterization of NP that does not refer to non-deterministic
computations. It is based on the easy (polynomial) verification of a "yes"
instance by a so called witness. However, there need not be any such witnesses
for "no" instances.



A good example is factoring of integers. Suppose the language under
consideration is the set of composite natural numbers, i.e. non-prime numbers.
A "yes" instance, let's say a number n, can always be ascertained by exhibiting
a factor, say $w_y$. By simply dividing n by $w_y$ it can be verified (in
polynomial time) that n is indeed composite. On the other hand, supplying a
"no" witness $w_n$ is quite useless, since even if $w_n$ does not divide n,
there might very well be another "yes" witness not yet exhibited. So, without
deeper insight into the problem of determining whether a number is prime or
composite, all tentative factors must be checked before the verdict prime can
be passed.^

A language L is in NP if there is a Turing machine such that

• If $x \in L$, there exists a witness w such that when the machine is started
with x and w as inputs, it halts in the "yes" state after a time polynomial
in the size $|x|$ of x.

• If $x \notin L$, then for all purported witnesses w, the machine halts in the
"no" state after a time polynomial in the size $|x|$ of x, when started with
x and w as inputs.
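
The witness characterization can be illustrated with the compositeness example. The following is only a minimal sketch in Python, not a formal Turing machine; the function name `verify_composite` is a hypothetical choice made here for illustration.

```python
def verify_composite(n: int, w: int) -> bool:
    """Return True ("yes") if w is a valid witness that n is composite.

    A valid witness is a non-trivial factor: 1 < w < n and w divides n.
    The check is a single division, so its cost is polynomial in the
    number of digits of n.
    """
    return 1 < w < n and n % w == 0


# "Yes" instance: 91 = 7 * 13, so 7 is a witness.
print(verify_composite(91, 7))    # True
# A failed witness proves nothing: 91 is composite although 5 is rejected.
print(verify_composite(91, 5))    # False
# "No" instance: 97 is prime, so every purported witness is rejected.
print(any(verify_composite(97, w) for w in range(100)))  # False
```

Note the asymmetry: one accepted witness settles a "yes" instance, whereas a "no" instance is only settled once all purported witnesses have been rejected.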

It is not known whether P is a strict subset of NP. The conjecture $P \neq NP$
is one of the main unsolved problems in complexity theory.

The class PSPACE 

The class PSPACE is the space analogue to P. It is defined as follows. 

The space complexity class PSPACE is the collection of all languages that
are in $\mathrm{SPACE}(n^k)$ for some constant k. That is, a language is in
PSPACE if it can be decided by a deterministic Turing machine using a number of
working bits polynomial in the input size. There is no limit to the amount of
time used.

It is clear that P is included in PSPACE simply because a machine that
halts after a polynomial number of steps can only traverse a polynomial number
of tape squares. Thus, $P \subseteq PSPACE$, but it is not known whether the
inclusion is strict, i.e. whether $P \neq PSPACE$ or not.

The class BPP 

If probabilistic algorithms are considered, then corresponding probabilistic
complexity classes can be defined. The bounded error probabilistic class, BPP,
is defined to contain all languages L that can be decided by a polynomial-time
probabilistic Turing machine M, such that

• If $x \in L$, then M accepts x with a probability at least 3/4.

• If $x \notin L$, then M rejects x with a probability at least 3/4.

^Recently, such insight has indeed been gained, showing that primality testing
is in P after all.

The probability 3/4 is arbitrary; any constant probability strictly greater
than 1/2 would suffice in the definition.
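
The reason the threshold is arbitrary is the standard amplification argument, which is not spelled out in the text: the machine is run several times and the majority answer is taken. The sketch below, with the hypothetical helper name `majority_success`, computes the exact probability that a majority of k independent runs is correct when a single run is correct with probability p. It is an illustration under the assumption of independent runs and odd k, not part of the formal definition.

```python
from math import comb


def majority_success(p: float, k: int) -> float:
    """Probability that a majority of k independent runs is correct,
    when each run is correct with probability p (k is assumed odd)."""
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j)
               for j in range(k // 2 + 1, k + 1))


# Even a weak single-run guarantee (p = 0.6) improves with repetition.
for k in (1, 11, 101):
    print(k, round(majority_success(0.6, k), 4))
```

Since the number of repetitions needed to pass any fixed threshold is a constant, the total running time stays polynomial.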

There are many more complexity classes, as well as lots of inclusion relations
between them. The reader is referred to the literature for a thorough discussion.
We will briefly return to the topic in chapter 6 on the complexity of quantum
computation.

Chapter 3

Algebra of quantum bits 



There are a few different models of quantum computation in the literature. The
most popular, and most thoroughly worked out, is the quantum circuit model
[25]. Quantum circuits are the quantum analogue of classical circuits built out
of logic gates. Another model, the quantum Turing machine, is the quantum
analogue of the classical Turing machine. But just as general-purpose digital
computers are not really built as Turing machines, it does not seem practical 
to build real quantum computers as quantum Turing machines. In this respect, 
quantum circuits seem to be closer to actual implementation as physical devices. 

This chapter is an introduction to the subject of quantum computation. 
The circuit model of computation introduced in chapter 2 will be elaborated 
and realized in terms of vectors and matrices, thus offering what could be called 
an algebra of bits and quantum bits. In this way we will be able to see precisely 
where the quantum paradigm of computation breaks away from the classical. 
Precise and general definitions of concepts, as well as a detailed treatment will 
follow in subsequent chapters. 

3.1 Classical and quantum physical systems 

In contrast to the classical theory of computation, research on quantum
computation pays closer attention to the physical substratum of the computer.
In part this is due to the very real problems of actually building devices capable
of performing quantum computations. It is appropriate therefore to begin with
a short discussion of the concept of a physical system.

A simple example of a classical system is a gas in a container. Pressure,
temperature, and volume give the macroscopic state of the gas. In classical 
physics, these variables can range over a continuous set of values corresponding 
to a continuous state space. The microscopic state is given by the values of all 
positions and all momenta of all the particles in the gas. This forms a huge 
continuous state space. 

In quantum physics, state spaces can be discrete or continuous or both.
Continuous state spaces occur for free particles or particles scattered off a potential,
whereas discrete state spaces occur for bound state systems, notably particles 
bound by a potential field. The standard example of a bound state system is 
the hydrogen atom which can be in a set of discrete states, each state being 
characterized by values of energy and a couple of other variables. In this case 
the energy can only range over a discrete set of values. If the atom absorbs 
enough energy, it will become ionized and the electron will no longer be bound 
by the potential of the nucleus. This corresponds to the continuous part of the 
state space. 

A gas of quantum particles in a container will have a huge discrete micro- 
scopic state space. There is no practical way to distinguish different states of 
such a system and it is useless for computational purposes. In order for a
physical system to be useful as a computer, it must be possible to exercise precise
control over the states of the system. Typically, it must be possible to prepare
the system in an input state, and then let the system evolve according to
dynamical laws (this corresponds to the program) and subsequently to measure
an output state after the computation is completed. 

To conclude, in both classical and quantum physics one speaks of the state 
of a physical system, and the states are characterized by the values of a number 
of variables. The state spaces in classical physics are continuous. This is true 
also for systems like the bit. In solid state devices, voltage levels represent the 
two states of the bit and there is a certain range within which these voltage 
levels are allowed to vary. However, the levels must be well separated so that
no overlap of the ranges occurs. This ensures the discrete digital nature of the
device.

We will return to the physics of computing systems in the last two chapters.

3.2 Two-state quantum systems and the quantum bit

The basic building block in most quantum computation models is the qubit. 
The qubit is a quantum generalization of the classical bit. A bit can be in either
of the two well-defined states 0 and 1, and a classical memory register can be
modeled by a string of bits. There is no interaction between the separate bits 
in the register. Information processing, or computation, can be regarded as bit 
flips performed on the register. Reading and writing single bits are the most 
primitive acts of computation. 

A qubit is a quantum system having two states. These states are denoted
by $|0\rangle$ and $|1\rangle$.^ These states are the quantum versions of the two
states of the bit, 0 and 1. The fundamental difference between classical and
quantum physics is that whereas a classical system must be in a definite state,
a quantum system can be in a superposition of a set of states. The bit must be
either 0 or 1. But the qubit can be in a complex linear combination of
$|0\rangle$ and $|1\rangle$, namely

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle. \tag{3.1}$$

^The notation will be explained later on.

Here $\alpha$ and $\beta$ are complex numbers and $|\psi\rangle$ is used to denote
the general state. Precise definitions of the quantum mechanical notations will
be given in the next two chapters. Suffice it here to note that the proper
framework for this is complex linear vector spaces, and consequently, the states
$|0\rangle$ and $|1\rangle$ can be thought of as basis states in such a space.

The typical example of a two-state quantum system is the spin states of
a spin-1/2 particle like the electron, but the precise physical nature of the system
will not concern us at the moment. We will instead develop the theory of 
quantum computation based on generic two-state quantum systems. Questions 
of practical implementations will be returned to in chapter 9. 

There is a certain restriction on the complex numbers $\alpha$ and $\beta$ having
to do with the interpretation of quantum mechanics. Classically, one can
determine which state the bit is in, and one will get 0 or 1 according to the
actual state of the bit. Quantum mechanically, the situation is different.

The process of obtaining information out of the qubit is called a measurement.
If the qubit is in (or is known to be in) either the state $|0\rangle$ or the state
$|1\rangle$, a measurement performed on it will give the result 0 or 1
respectively. If however the qubit is in the general state $|\psi\rangle$, the
measurement will give 0 with probability $|\alpha|^2$ and 1 with probability
$|\beta|^2$. There is no way, for a single qubit, to determine its precise state,
i.e. there is no way to determine the values of $\alpha$ and $\beta$. If however
we have a large collection of identically prepared qubits, repeated measurements
on the qubits will yield statistical values for $|\alpha|^2$ and $|\beta|^2$. No
single measurement can ever determine the values of $\alpha$ and $\beta$.
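
The statistical nature of measurement can be mimicked with a small simulation. This is a hedged sketch only: the function names `measure` and `estimate_p0` are hypothetical, a classical pseudo-random generator stands in for the quantum process, and the amplitudes are assumed to be chosen so that the two probabilities add up to one.

```python
import random


def measure(alpha: complex, beta: complex, rng: random.Random) -> int:
    """Simulate one measurement of alpha|0> + beta|1> in the 0/1 basis."""
    return 0 if rng.random() < abs(alpha) ** 2 else 1


def estimate_p0(alpha: complex, beta: complex, shots: int, seed: int = 0) -> float:
    """Estimate |alpha|^2 from `shots` identically prepared qubits."""
    rng = random.Random(seed)
    zeros = sum(1 for _ in range(shots) if measure(alpha, beta, rng) == 0)
    return zeros / shots


# Assumed example amplitudes: |alpha|^2 = 0.36 and |beta|^2 = 0.64.
alpha, beta = 0.6, 0.8j
print(estimate_p0(alpha, beta, 20000))
```

Each single call to `measure` returns just 0 or 1; only the relative frequency over many prepared copies approaches the squared modulus of the amplitude, and the phase of the amplitude is not visible at all in this basis.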

However, since a measurement must yield either 0 or 1, this probabilistic
interpretation gives the restriction

$$|\alpha|^2 + |\beta|^2 = 1 \tag{3.2}$$

on the numbers $\alpha$ and $\beta$.

A quantum measurement will have an effect on the state of the system after
the measurement. If a measurement is performed on the general state
$|\psi\rangle$ and the result is 1, the state will be $|1\rangle$ after the
measurement. Likewise, if the result is 0, the state after the measurement will
be $|0\rangle$. This is a general property of measurements.

There is a further difference with regard to classical physics. Classically it
does not make sense to consider measuring the bit in any other state than 0 or
1 because there are no other states. However, quantum mechanically we can
consider, for example, the special states

$$|+\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \tag{3.3}$$

$$|-\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right). \tag{3.4}$$

Just as one can make a measurement on a general state with respect to
the states $|0\rangle$ and $|1\rangle$, one can measure with respect to the
states $|+\rangle$ and $|-\rangle$. A qubit in the state $|0\rangle$, say, will
when measured with respect to the states $|+\rangle$ and $|-\rangle$, yield the
result '+' with probability 1/2 and '-' with probability 1/2.
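
The probabilities 1/2 can be checked numerically. The sketch below assumes the usual column-vector representation $|0\rangle = (1, 0)$, $|1\rangle = (0, 1)$ and the states $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$, and uses the rule that the probability of an outcome is the squared modulus of the inner product between the outcome state and the measured state. The helper name `probability` is hypothetical.

```python
from math import sqrt

# Assumed vector representation: |0> = (1, 0), |1> = (0, 1).
ZERO = (1, 0)
ONE = (0, 1)
PLUS = (1 / sqrt(2), 1 / sqrt(2))
MINUS = (1 / sqrt(2), -1 / sqrt(2))


def probability(outcome, state):
    """Probability |<outcome|state>|^2 of finding `state` in `outcome`."""
    amplitude = sum(complex(o).conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2


# Both values are 1/2 up to floating-point rounding.
print(probability(PLUS, ZERO), probability(MINUS, ZERO))
```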

These non-classical features of the theory will be elaborated in the next two 
chapters on quantum mechanics. 

3.3 Multiple qubit states 

Multiple qubit states are modeled on their classical analogue, the bit strings. A
classical two-bit register can store any of the bit strings 00, 01, 10 or 11. The
quantum analogues of these strings are $|00\rangle$, $|01\rangle$, $|10\rangle$
and $|11\rangle$ respectively, and they can be regarded as a basis for a
four-dimensional vector space.

A general 2-qubit state can now be written as a linear combination of the
basis states,

$$|\psi\rangle = a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle.$$

Two facts can be noted at this stage. Firstly, as already noted for the single
qubit, whereas a classical memory register must be in a definite state
corresponding to the actual values of the stored bits, the quantum memory
register can be in a linear combination of all the basis states. This is
referred to as superposition of states. Secondly, there are quantum states of
the memory register that cannot be expressed as direct products of single-qubit
states. One example is the state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$,
which in no way can be written as a product of two single-qubit states built
from $|0\rangle$ and $|1\rangle$. This property of quantum mechanics is called
entanglement.
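
For two qubits there is a simple test for whether a state factorizes. Multiplying out $(a_0|0\rangle + a_1|1\rangle)(b_0|0\rangle + b_1|1\rangle)$ gives coefficients $a_{00} = a_0 b_0$, $a_{01} = a_0 b_1$, $a_{10} = a_1 b_0$, $a_{11} = a_1 b_1$, so that $a_{00}a_{11} = a_{01}a_{10}$; for two qubits this condition is also sufficient. The sketch below applies this criterion; the function name `is_product_state` and the numerical tolerance are assumptions made for illustration.

```python
from math import sqrt


def is_product_state(a00, a01, a10, a11, tol=1e-12):
    """True if a00|00> + a01|01> + a10|10> + a11|11> factorizes.

    A product (a0|0> + a1|1>)(b0|0> + b1|1>) has a00*a11 == a01*a10,
    and for two qubits this condition is also sufficient.
    """
    return abs(a00 * a11 - a01 * a10) < tol


# |+>|+> has all four coefficients equal to 1/2 and factorizes.
print(is_product_state(0.5, 0.5, 0.5, 0.5))              # True
# (|00> + |11>)/sqrt(2) does not factorize: it is entangled.
print(is_product_state(1 / sqrt(2), 0, 0, 1 / sqrt(2)))  # False
```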

These features of quantum mechanics, superposition of states and
entanglement, are crucial to the theory of quantum computation.

If we denote a single bit by b, a classical n-bit string can be written as
$b_1 b_2 \cdots b_n$. The quantum analogue is $|b_1 b_2 \cdots b_n\rangle$. A
general state is a linear combination of these $2^n$ basis states. These states
are called computational basis states.

3.4 Computation 

Computation can be seen as a transformation of an input state to an output
state. If both input and output are represented by n-bit strings, then the
computation can be performed by applying a $2^n \times 2^n$ matrix to the input.

At first sight one might be tempted to use $n \times n$ matrices, representing
the states by n-dimensional vectors, the components of which are taken to be the
bits of the bit strings. However, that cannot work, as can be seen even in the
simplest case of just one bit. Representing the input bit by the (one-dimensional)
vector i and the output by o, the computational relation connecting output and
input is

$$i \mapsto o = Ci$$

for some matrix C which in this case is just a number. But then the bit flip
$0 \to 1$, $1 \to 0$ cannot be represented with one and the same number C. Thus
one bit of information must actually be represented by a two-dimensional space.



Representations of classical bits and quantum bits 

We will introduce a convenient representation for both classical and quantum
bits. The constructions are actually the same, but different notation will be used
in order to highlight the differences between classical and quantum computation.
The values for a classical bit will be denoted in boldface as $\mathbf{0}$ and
$\mathbf{1}$ and they will be represented as two-dimensional vectors as

$$\mathbf{0} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \mathbf{1} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.5}$$

Correspondingly, the quantum bits will be represented in terms of the same
vectors as

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.6}$$

Note one difference in interpretation in this context. As in the preceding
section, the quantum states $|0\rangle$ and $|1\rangle$ are basis vectors in a
complex vector space and consequently it makes sense to consider linear
combinations as in equation (3.1). For the classical bit values $\mathbf{0}$ and
$\mathbf{1}$ we introduce no such structure.^

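Next, bit strings and multi-qubit states will be represented by direct products
of these two-dimensional vectors. The rules are very simple, and we will write
them out in the case of products of two and three vectors.

$$\begin{pmatrix} a_0 \\ a_1 \end{pmatrix} \otimes \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} a_0 b_0 \\ a_0 b_1 \\ a_1 b_0 \\ a_1 b_1 \end{pmatrix} \tag{3.7}$$

$$\begin{pmatrix} a_0 \\ a_1 \end{pmatrix} \otimes \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \otimes \begin{pmatrix} c_0 \\ c_1 \end{pmatrix} = \begin{pmatrix} a_0 b_0 c_0 \\ a_0 b_0 c_1 \\ a_0 b_1 c_0 \\ a_0 b_1 c_1 \\ a_1 b_0 c_0 \\ a_1 b_0 c_1 \\ a_1 b_1 c_0 \\ a_1 b_1 c_1 \end{pmatrix} \tag{3.8}$$

From these two cases, the principle should be clear. As an example, the four
different two-bit strings can be represented by the vectors

$$\mathbf{00} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{01} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{10} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{11} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}. \tag{3.9}$$

^It could be somewhat artificially introduced in order to represent probabilistic
computation. Still, even so there are fundamental differences as compared to
quantum computation.

The corresponding two-qubit states $|00\rangle$, $|01\rangle$, $|10\rangle$ and
$|11\rangle$ have exactly the same vector representation. Following [22] I call
these vectors the classical basis.

When the numbers of bits or qubits are large it is convenient to use a shorthand
notation using the base-10 representation of the bit strings interpreted as
binary numbers. For example, the bit string 0101 will be denoted $\mathbf{5}_4$,
and similarly $|0101\rangle$ is denoted by $|5\rangle_4$. The suffix is needed in
order to remove any ambiguity as to the number of bits or qubits that the
number represents.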
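
The correspondence between bit strings, base-10 labels and basis vectors can be sketched in a few lines. The helper names `basis_vector` and `label` are hypothetical, and the textual form `|5>_4` is only an ASCII rendering of the shorthand notation.

```python
def basis_vector(bits: str) -> list[int]:
    """One-hot column vector (as a list) representing a bit string."""
    n = len(bits)
    index = int(bits, 2)          # base-10 label, e.g. "0101" -> 5
    vector = [0] * (2 ** n)
    vector[index] = 1
    return vector


def label(bits: str) -> str:
    """Shorthand label such as |5>_4 for the state |0101>."""
    return f"|{int(bits, 2)}>_{len(bits)}"


print(basis_vector("10"))   # [0, 0, 1, 0]
print(label("0101"))        # |5>_4
```

The position of the single 1 in the vector is exactly the base-10 label, which is why the suffix giving the number of bits is needed to recover the string.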



One-bit classical computations 

The bit-flip program, flip, can now be represented by the matrix

$$\mathrm{FLIP} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

which, of course, corresponds to the logical operation NOT. In the context of
quantum computation, this matrix is also called X for reasons that will become
clear subsequently.

Furthermore, there are three more distinct programs for a one-bit state,
namely

$$\mathrm{id}: \begin{matrix} 0 \to 0 \\ 1 \to 1 \end{matrix}, \qquad \mathrm{set}: \begin{matrix} 0 \to 1 \\ 1 \to 1 \end{matrix}, \qquad \mathrm{reset}: \begin{matrix} 0 \to 0 \\ 1 \to 0 \end{matrix}.$$

All these can be represented by $2 \times 2$ matrices. Note that the last two
computations, set and reset, are not reversible, whereas the first two, flip
(not) and id, are reversible.

Reversibility in this context means that the input can be deduced from the
output.
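
The four one-bit programs and their reversibility can be checked directly. The sketch below assumes the vector representation in which bit value 0 is the column $(1, 0)$ and bit value 1 is the column $(0, 1)$, so that each matrix column is the image of one input; the names `apply` and `is_reversible` are hypothetical helpers.

```python
# Columns of each matrix are the images of the inputs 0 and 1.
FLIP = [[0, 1], [1, 0]]
ID = [[1, 0], [0, 1]]
SET = [[0, 0], [1, 1]]      # 0 -> 1, 1 -> 1
RESET = [[1, 1], [0, 0]]    # 0 -> 0, 1 -> 0

ZERO = [1, 0]
ONE = [0, 1]


def apply(matrix, vector):
    """Matrix-vector product for small integer matrices."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def is_reversible(matrix):
    """A one-bit program is reversible if distinct inputs give distinct outputs."""
    return apply(matrix, ZERO) != apply(matrix, ONE)


for name, m in [("flip", FLIP), ("id", ID), ("set", SET), ("reset", RESET)]:
    print(name, apply(m, ZERO), apply(m, ONE), is_reversible(m))
```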



Two-bit classical computations 

On the 2-bit states (3.9), certain $4 \times 4$ matrices represent computations.
To take just one example, consider the operation of exchanging the values of the
two bits,

$$\mathrm{swap}: \begin{matrix} 00 \to 00 \\ 01 \to 10 \\ 10 \to 01 \\ 11 \to 11 \end{matrix} \tag{3.10}$$

A matrix effecting this transformation is

$$\mathrm{SWAP} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{3.11}$$
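
The SWAP matrix can be generated mechanically from the table of the computation. The following sketch assumes that the table lists every input string exactly once; the function name `matrix_from_map` is a hypothetical helper.

```python
def matrix_from_map(mapping: dict[str, str]) -> list[list[int]]:
    """Build the 0/1 matrix of a classical computation on bit strings.

    Column j (the input, read as a binary number) has a single 1 in
    row i (the output, read as a binary number).
    """
    size = len(mapping)           # assumes all 2^n inputs are listed
    matrix = [[0] * size for _ in range(size)]
    for source, target in mapping.items():
        matrix[int(target, 2)][int(source, 2)] = 1
    return matrix


SWAP = matrix_from_map({"00": "00", "01": "10", "10": "01", "11": "11"})
for row in SWAP:
    print(row)
```

The same helper reproduces the one-bit matrices, for instance the bit flip from the table `{"0": "1", "1": "0"}`.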



Multi-bit classical computations 

In this view of computation, n-bit strings are represented by $2^n$-dimensional
vectors and computations are represented by $2^n \times 2^n$ square matrices.
The number of input and output bits are the same.

Some restrictions on the allowed matrices can be derived. The input bit
string is represented by a column vector with just one 1 and the remaining
entries 0. The output bit string must also be represented in the same way. This
puts a severe restriction on the possible matrices. Writing $N = 2^n$ for the
dimension, consider applying a certain matrix C to an n-bit state,

$$\begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1N} \\ C_{21} & C_{22} & \cdots & C_{2N} \\ \vdots & \vdots & & \vdots \\ C_{N1} & C_{N2} & \cdots & C_{NN} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix},$$

where the n-bit state is represented by a column vector with just one entry,
the j-th, different from 0 and equal to 1. Carrying out the matrix
multiplication yields

$$\begin{pmatrix} C_{1j} \\ C_{2j} \\ \vdots \\ C_{Nj} \end{pmatrix},$$

i.e. it pulls out the j-th column from the matrix C. If this column vector is to
represent an n-bit state, exactly one of the coefficients $C_{ij}$ must be equal
to 1 and the others must be 0.

This shows that the computation matrix C is a matrix of zeros, except for
precisely one entry in each column which is equal to 1. As an aside, note that for
n-bit computations, represented by $2^n \times 2^n$ matrices, there are
$(2^n)^{2^n} = 2^{n 2^n}$ different possible matrices.
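
Both the column condition and the count are easy to check for small n. In the sketch below the function names are hypothetical; the count follows from choosing the position of the single 1 independently in each of the $2^n$ columns.

```python
def count_classical_matrices(n: int) -> int:
    """Number of 2^n x 2^n matrices with exactly one 1 in every column."""
    dim = 2 ** n
    return dim ** dim


def is_classical_computation(matrix) -> bool:
    """Check that every column has exactly one 1 and zeros elsewhere."""
    columns = zip(*matrix)
    return all(sorted(col) == [0] * (len(col) - 1) + [1] for col in columns)


print([count_classical_matrices(n) for n in (1, 2, 3)])
```

For n = 1 the count is 4, which matches the four one-bit programs flip, id, set and reset.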

In the classical case working with this model of computation is very
uneconomical as it requires exponentially sized vectors and matrices, most of
whose entries are zero anyway. It corresponds to a unary notation for numbers.

Note, however, the very close correspondence with the circuit model. In fact,
this model is a realization of the circuit model, hence the name computational
(or better classical) basis for the vectors (3.9).

In the classical case, we don't really need this expansion of the bit strings
into exponentially sized vectors. There are much more efficient ways to process
bit strings.

Transition to quantum computations

The possibility to have linear combinations, called superpositions, of the basis
vectors marks the point of departure into quantum computations.^ A complex
vector space is built on the basis vectors (which, by the way, form a normalized
and orthogonal set). As an example, consider the case of 3-qubit states. Linear
combinations can now be written nicely,

$$|\psi\rangle = \sum_{i=0}^{7} a_i |i\rangle_3 = a_0|0\rangle_3 + a_1|1\rangle_3 + \cdots + a_7|7\rangle_3 = a_0|000\rangle + a_1|001\rangle + \cdots + a_7|111\rangle.$$

The general case is

$$|\psi\rangle = \sum_{i=0}^{2^n-1} a_i |i\rangle_n. \tag{3.12}$$

The coefficients are normalized,

$$\sum_{i=0}^{2^n-1} |a_i|^2 = 1. \tag{3.13}$$

Having introduced the complex numbers $a_i$ into the theory, there is no reason
to work with the very restrictive set of matrices used in classical computation. A
priori, any matrix C with complex entries could be contemplated as a candidate
for a computation. There are however restrictions even in the quantum case
that we will come to. But first, let us consider a few examples.

One-bit quantum computations 

Consider first a one-qubit space. Define two new matrices Y and Z by

$$Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \tag{3.14}$$

$$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.15}$$

Acting with Z on the 1-qubit basis vectors yields

$$Z|0\rangle = |0\rangle, \qquad Z|1\rangle = -|1\rangle.$$

^At least in this approach to the theory. There are other ways to look at it.

Such a computation has no meaning classically. But quantum mechanically,
the state (call it $|\phi\rangle$) resulting from acting with Z on the state of (3.1),

$$Z|\psi\rangle = Z(\alpha|0\rangle + \beta|1\rangle) = \alpha|0\rangle - \beta|1\rangle = |\phi\rangle,$$

is, so to speak, no worse than $|\psi\rangle$ itself.

If one hides the intermediate steps, this simple calculation
$Z|\psi\rangle = |\phi\rangle$ shows that Z transforms the state $|\psi\rangle$ in
one computational step.
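
The action of Z on a general one-qubit state can be reproduced numerically. This is a minimal sketch assuming Z is the diagonal matrix with entries 1 and -1 and that a state is stored as the pair of amplitudes of $|0\rangle$ and $|1\rangle$; the example amplitudes and the helper name `apply` are choices made here for illustration.

```python
Z = [[1, 0], [0, -1]]


def apply(matrix, state):
    """Apply a 2x2 matrix to a one-qubit state (alpha, beta)."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


alpha, beta = 0.6, 0.8j
phi = apply(Z, [alpha, beta])
print(phi)   # amplitudes of |0> and |1> after the step
# The measurement probabilities are unchanged; only the sign of beta flips.
print(abs(phi[0]) ** 2, abs(phi[1]) ** 2)
```

The sign change is invisible to a measurement in the 0/1 basis, which is one way to see that the step has no classical counterpart.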

Let us also introduce one further matrix, of utmost importance in quantum
computation. This is the so-called Hadamard matrix

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{3.16}$$

This matrix is used to build the linear superpositions of (3.3) and (3.4) out of
the basis states, as can be checked by a simple calculation:

$$H|0\rangle = |+\rangle, \tag{3.17}$$

$$H|1\rangle = |-\rangle. \tag{3.18}$$

Multi qubit computations

Generalizing to n-qubit states and $2^n \times 2^n$ dimensional computational
matrices, we have $C|\psi\rangle = |\phi\rangle$. Thus an exponential number of
classical computational steps are performed in parallel.

If this is to be simulated on a classical computer, then of course, an
exponential number of operations have to be performed anyway, and nothing is
gained as compared to performing the classical computations.

The situation is drastically changed if quantum devices can be built that
actually perform the operation C on the state $|\psi\rangle$.

Restriction on quantum computation matrices

A general quantum computation can now be written as

$$|out\rangle = C|in\rangle. \tag{3.19}$$

In a real quantum computer, this process of transforming the input state into
the output state is actually a dynamical process that occurs in time; we speak of
time evolution. Now it is a fundamental property of quantum mechanics that,
if no measurements are made, the time evolution is reversible. This means that
the input can be inferred from the output. This, and other requirements, puts
a restriction on the allowed computational matrices C. Deriving this restriction
demands tools that will be developed in the following two chapters. Here we
just state the result.

First, if we can find an inverse to C with the property $C^{-1}C = 1$, then
equation (3.19) can be inverted by multiplying through by $C^{-1}$,

$$C^{-1}|out\rangle = C^{-1}C|in\rangle = |in\rangle,$$

so that

$$|in\rangle = C^{-1}|out\rangle.$$

The requirement on the matrices C that makes it straightforward to invert
them is that they must be unitary. This means that C is invertible and its
inverse is equal to its conjugate transpose $(C^*)^T = C^\dagger$, or

$$C^{-1} = C^\dagger.$$

Taking the conjugate transpose is a computationally cheap process of
rearranging and complex conjugating the elements of the matrix.
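
Unitarity is easy to test for small matrices. The sketch below forms the conjugate transpose and checks that its product with the matrix is the identity, up to a numerical tolerance. The helper names are hypothetical, the Hadamard and Z matrices are assumed to have their standard forms, and the classical set program is included as an example of a matrix that is not unitary.

```python
from math import sqrt


def dagger(matrix):
    """Conjugate transpose: transpose, then complex-conjugate each entry."""
    return [[complex(x).conjugate() for x in col] for col in zip(*matrix)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def is_unitary(matrix, tol=1e-12):
    """Check that dagger(C) * C is the identity matrix."""
    product = matmul(dagger(matrix), matrix)
    n = len(matrix)
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


H = [[1 / sqrt(2), 1 / sqrt(2)], [1 / sqrt(2), -1 / sqrt(2)]]
Z = [[1, 0], [0, -1]]
SET = [[0, 0], [1, 1]]     # classical "set" program, not reversible

print(is_unitary(H), is_unitary(Z), is_unitary(SET))
```

The irreversible classical program fails the test, in line with the remark that reversibility restricts the allowed computational matrices.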
Chapter 4

Introduction to quantum mechanics

Quantum mechanics is not a physical theory in itself; it is rather a framework in
which physical theories must be formulated. If one takes a more fundamental,
or philosophical point of view, quantum mechanics is a basic characteristic of
reality which transcends all descriptions or theories of physical systems. It sets
certain limits on what can be known in principle about physical systems. The bare
bones of quantum mechanics can be formulated as a few postulates which every 
quantum mechanical description of a physical system must conform to. 

In the community of physicists, opinions differ as to the proper philosophical
status of quantum mechanics. The majority view seems to be to take it as a
fact of life, and since physical theories based on quantum mechanics in general
agree very well with experiment, the only sensible thing to do is to go on and
use it. There are features of quantum mechanics (for example the uncertainty
principle and entanglement) that are considered to be counterintuitive from an
everyday or classical physics perspective, but there is not a single experimental
fact contradicting quantum mechanics. Quite to the contrary, the theory is
verified every day in physics laboratories around the world. Quantum mechanics
has furthermore been corroborated during the last twenty years by special
experiments testing the very foundations of the theory [29].

There is, however, and has always been, a strand of physicists uncomfortable
with quantum mechanics. For them the theory is, though in practice
successful, in principle tentative, and eventually due to be replaced by a more 
satisfactory theory. The discussion goes back to the very beginning of quantum 
mechanics and in particular to the famous Bohr-Einstein debate. 

Some physicists maintain (as Niels Bohr is reported to have said) that
if you're not confused by quantum mechanics, then you haven't understood it,
while others, especially younger physicists, can't understand what all the fuss
is about. Clearly, this has more to do with one's own philosophical outlook than
with the theory itself, and we will leave this discussion here. As this is not a
work on fundamental principles of natural philosophy, I will adopt the standard
view that quantum mechanics is the proper framework for describing and
understanding physical systems, and that classical theories offer at best very
good approximations. It should be kept in mind though, that there are fundamental
problems having to do with the relation between quantum mechanics and
relativity, especially general relativity and the theory of gravitation. This is
not, at least not yet, of any importance to the theory of quantum computation.

The term quantum physics thus refers to any physical system, or theory, for- 
mulated according to the postulates of quantum mechanics. The term classical 
physics on the other hand, refers to physics not formulated using quantum me- 
chanics. Examples of classical theories are classical mechanics (or Newtonian 
mechanics), relativity (both special and general) and classical electrodynam- 
ics. These classical theories are, as already noted, excellent approximations to
physical phenomena that take place on a macroscopic scale, and often even to
microscopic phenomena. But in principle, physics is 'quantum'.

Many physical theories come in both a classical and a quantum version and 
there are well defined procedures to pass between them. The procedure of going 
from classical to quantum is called quantization. Very often, quantum theories 
are formulated by first writing down a classical theory which is then quantized 
according to a set of heuristic rules. 

Obviously, in order to work on quantum computation you need some grasp of
quantum mechanics. It is in fact not difficult to rapidly gather together the basic
elements of quantum mechanics on a couple of pages, and this seems to be what
most review articles do. Such a brief exposé of the quantum mechanical
toolbox tends, however, to be rather dull and, I think, fairly incomprehensible
if you haven't already studied the subject.

I will adopt another strategy. Quantum mechanics will be introduced through 
a set of simple physical toy models. These will be the standard models that have 
traditionally proved their worth in physics education. In the course of working 
through the models, all relevant quantum mechanical concepts can be abstracted 
from these concrete models. We will recklessly assume that what's true in the
particular case is true in the general case unless otherwise stated. Of course,
such an approach is only useful in a first general introduction to a subject, and is
not a substitute for proper study. I also think that this approach will be helpful
when implementations of quantum computation in terms of physical devices are 
discussed briefly in chapter 9. 

The simple model systems we will consider are: 

• Particle in a potential box 

• Harmonic oscillator 

• Spin 

There are a few popular formulations of quantum theory. One of them uses
configuration space wave functions and their conjugate momentum space Fourier
transforms. This is one of the traditional formulations, originating with Erwin
Schrödinger, and it is traditionally called wave mechanics. Another formulation,
contemporary with Schrödinger's, is Heisenberg's matrix mechanics. These are
the original formulations of quantum mechanics and they were almost immediately
shown to be equivalent. In the context of a theoretical study of quantum
computation another formulation, somewhat more abstract, and which can be
considered as a generalization of the other formulations, due to P. A. M. Dirac,
is more appropriate. A very readable account of this formulation is Dirac's own
classic book, The Principles of Quantum Mechanics.

A fourth formulation, the path integral formulation, developed by R. P.
Feynman in the 1940s, will not be treated here. It is extremely useful
in modern theoretical physics, but its methods do not seem to be needed
in quantum computation.



4.1 Quantum mechanics in one space dimension 

As our introduction to quantum mechanics we will study a particle moving in one
dimension of space under the influence of a potential. The state of the particle
is described by the wave function $\psi(x,t)$, where x is the space coordinate and
t is the time. The states of a system can be described in different ways. This
particular representation is called the configuration space representation, where
configuration refers to using space to parameterize the state. The dynamics of
the state is governed by the Schrödinger equation [32]

$$i\hbar \frac{\partial \psi}{\partial t} = H\psi. \tag{4.1}$$

In this equation, $\hbar$ is a physical constant which sets the scale of quantum
phenomena.^ $H$ is the Hamiltonian operator. The equation equates the time
rate of change of the wave function with the action of the Hamiltonian, thus the
dynamics of the state is encoded in the form of $H$. In quantum computation,
the 'program' of the quantum computer can be regarded as encoded in the
Hamiltonian. But more on this later on.

The Hamiltonian is related to the classical energy of the system. In classical
physics, a particle has a mechanical energy consisting of kinetic energy K and
potential energy V, and the total energy is $E = K + V$. The kinetic energy is
given by

$$K = \frac{p^2}{2m},$$

where p is the particle momentum, classically related to the velocity v
through $p = mv$ where m is the particle mass. Thus, the kinetic energy can also
be written as

$$K = \frac{mv^2}{2},$$

a formula perhaps more readily recognized by non-physicists. However, the
first form is the fundamental one.

^Its numerical value is $1.054 \times 10^{-34}$ Js.

The potential energy depends on the forces acting on the particle. Forces 
are not further analyzed in this context, and a formula is simply given for V. 
In general, it is a function of space and time, but we will only consider time- 
independent potentials. 

Quantization is performed via the heuristic rules

replace x by $x\,\cdot$

replace p by $-i\hbar \frac{\partial}{\partial x}$,

or more concisely

$$x \to x\,\cdot \tag{4.3}$$

$$p \to -i\hbar \frac{\partial}{\partial x} \tag{4.4}$$

In these rules, the left hand sides should be thought of as classical physics
entities, whereas the right hand sides stand for the corresponding quantum
mechanical operators. An operator can be either multiplication by a function
$f\,\cdot$ or a differential operator D (as in the second rule) acting on the
state.^ If this sounds confusing, this is not the proper time to worry. It is best
just to carry on in order to get a little bit more used to the quantum mechanical
machinery.

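If one applies these rules to the classical energy, one gets the Hamiltonian,
or in formulas

$$H = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x). \tag{4.5}$$

The Schrödinger equation now becomes

$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m} \frac{\partial^2 \psi}{\partial x^2} + V(x)\psi. \tag{4.6}$$

This is a partial differential equation governing the time development of the
system. This simple example captures most of the main features of quantum
mechanics in this formulation.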
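
The step from the classical energy to the Hamiltonian can be spelled out as follows. This is a short worked derivation under the stated quantization rules, for a time-independent potential $V(x)$; the only step that needs care is the square of the momentum operator.

```latex
\begin{align*}
E &= \frac{p^2}{2m} + V(x)
  && \text{classical energy} \\
p^2 &\to \left(-i\hbar \frac{\partial}{\partial x}\right)
         \left(-i\hbar \frac{\partial}{\partial x}\right)
     = (-i)^2 \hbar^2 \frac{\partial^2}{\partial x^2}
     = -\hbar^2 \frac{\partial^2}{\partial x^2}
  && \text{rule for } p \text{ applied twice} \\
H &= -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x)
  && \text{rule for } x \text{ leaves } V(x) \text{ as multiplication}
\end{align*}
```

Inserting this operator into $i\hbar\,\partial\psi/\partial t = H\psi$ gives the partial differential equation for $\psi(x,t)$.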

A more realistic system would be in three spatial dimensions. The force
acting on the particle is given by the potential, examples of which could be
the Coulomb field from an atomic nucleus on an electron, forces from other
electrons and perhaps time-dependent electromagnetic fields. But we will stick
to this simple one-dimensional system and solve the equation in two cases: the
square well potential and the harmonic oscillator.

The first steps in the solution are general and do not depend on the form
of the potential, except that it is assumed to be independent of time. The
method is the standard separation of variables method used in solving partial
differential equations. We review it here in order to highlight the aspects that
will be abstracted later on.

Footnote: Other representations of quantum mechanical operators will appear subsequently.



4.1.1 Separation of space and time 

If the potential is time-independent, i.e. if $V(x,t) = V(x)$, the Schrödinger
equation can be simplified by separating the variables

$$\psi(x,t) = \sum_n u_n(x) f_n(t). \tag{4.7}$$

This is an ansatz for the solution, which can be justified by referring to
general theorems on partial differential equations [35]. Here, $\{u_n\}$ and
$\{f_n\}$ are enumerable infinite sets of functions.

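If the ansatz is inserted into the Schrödinger equation one gets, for each
term $u(x)f(t)$ of the sum,

$$i\hbar\,u(x)\frac{df}{dt} = -\frac{\hbar^2}{2m} f(t)\frac{d^2u}{dx^2} + V(x)\,u(x)f(t),$$

which upon division by $u(x)f(t)$ yields

$$i\hbar\,\frac{1}{f}\frac{df}{dt} = -\frac{\hbar^2}{2m}\frac{1}{u}\frac{d^2u}{dx^2} + V(x).$$

Now, since the left hand side is independent of x and the right hand side is
independent of t, both sides must be equal to the same constant E. This constant
is called a separation constant. We thus get two ordinary differential equations,
one for the time-dependent function $f$ and one for the space-dependent function
$u$:

$$i\hbar\frac{df}{dt} = E f \tag{4.8}$$
$$-\frac{\hbar^2}{2m}\frac{d^2u}{dx^2} + V(x)\,u = E u \tag{4.9}$$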

The first equation is easy to solve:

$$f(t) = C \exp\left(-\frac{iEt}{\hbar}\right), \tag{4.10}$$

where C is a constant.
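As a quick check, offered here as a sketch rather than as part of the original derivation, one can differentiate the proposed solution and insert it back into the time equation $i\hbar\, df/dt = Ef$:

$$\frac{df}{dt} = -\frac{iE}{\hbar}\,C\exp\left(-\frac{iEt}{\hbar}\right) = -\frac{iE}{\hbar}\,f(t)
\quad\Longrightarrow\quad
i\hbar\frac{df}{dt} = i\hbar\left(-\frac{iE}{\hbar}\right)f(t) = E\,f(t).$$

The constant C drops out of the check, which is why it remains undetermined at this stage.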

The second equation is an eigenvalue equation of the Sturm-Liouville type,
and its general form can be written abstractly as

$$H u = E u.$$

The solutions to this equation will be precisely the functions $u_n(x)$ in the
expansion (4.7). They are referred to as eigenfunctions and the constants $E_n$
as eigenvalues.

For special forms of the potential V (and appropriate boundary conditions),
the equation defines well-known systems of orthonormal functions $\{u_m(x)\}$ with
the index m running over some infinite subset of $\mathbb{Z}$. The general solution to the
wave equation can then be written by inserting (4.10) into (4.7)

$$\psi(x,t) = \sum_n u_n(x) \exp(-iE_n t/\hbar),$$

where the constant C has been hidden in the as yet undetermined functions
$u_n(x)$.

4.1.2 Particle in a potential well 

With the general groundwork done, we will now turn to our first example,
a quantum particle trapped within a container with impenetrable walls. In
reality, there is no such thing, but it can be mimicked by choosing a potential
of the form

$$V(x) = \begin{cases} \infty, & \text{if } |x| \ge a \\ 0, & \text{if } |x| < a \end{cases} \tag{4.11}$$

with the walls at the locations $x = -a$ and $x = a$. The impenetrability of
the walls is modeled by the infinite value for the potential outside the well. This
potential is often called a square well potential.

Since the potential has three distinct regions, being discontinuous at $x = \pm a$,
the equation must be solved in the three regions separately. However, since in
the two regions $x < -a$ and $x > a$ the potential is infinite, the function $u$ must
be equal to zero there. Furthermore, $u$ itself must be continuous at the potential
walls. This translates into boundary conditions for the solution in the region
$|x| < a$. Therefore, we get

$$-\frac{\hbar^2}{2m}\frac{d^2u}{dx^2} = E u \tag{4.12}$$

with boundary conditions

$$u(a) = u(-a) = 0. \tag{4.13}$$



In this form, the boundary conditions are quite easy to understand. In
classical physics, no particle can pass from a region with finite potential energy
into a region with infinite potential energy, as that would require an infinite
kinetic energy. This is true in quantum physics also. And since $u$ in some
sense corresponds to the presence of the particle (in a way that will be explained
later), those boundary conditions correspond to the impenetrable walls.

This differential equation has the well known solution

$$u(x) = A \sin kx + B \cos kx$$

with $k = \sqrt{2mE/\hbar^2}$.

Inserting the boundary conditions, we get the two equations

$$\begin{cases} A \sin ka + B \cos ka = 0 \\ -A \sin ka + B \cos ka = 0, \end{cases}$$

which reduce to

$$\begin{cases} A \sin ka = 0 \\ B \cos ka = 0. \end{cases}$$

Since these sine and cosine expressions cannot be simultaneously zero, we
get the solutions

$$\begin{cases} A = 0 \text{ and } \cos ka = 0 \;\Rightarrow\; ka = n\pi/2 \text{ with } n \text{ odd} \\ B = 0 \text{ and } \sin ka = 0 \;\Rightarrow\; ka = n\pi/2 \text{ with } n \text{ even.} \end{cases}$$

In fact it suffices to restrict the solutions to positive integers n. The
value n = 0 gives only the trivial solution u = 0, and the solutions for
negative integers are not linearly independent of the solutions with positive
integers. This is apparent from the explicit form of the solutions

$$u_n(x) = \begin{cases} B\cos(n\pi x/2a), & n \text{ odd} \\ A\sin(n\pi x/2a), & n \text{ even.} \end{cases} \tag{4.14}$$

Changing n to -n in these formulas has no effect for the cosine solutions, and
for the sine solutions, the ensuing change of sign for sine can be absorbed into
the constant A.

The constants A and B are normalization constants to be determined by the
normalization condition

$$\int_{-a}^{a} u_n(x)^*\, u_n(x)\, dx = 1 \tag{4.15}$$

where '*' denotes complex conjugation. Clearly, some normalization of the
solutions is needed, and this particular one is related to the interpretation of
quantum mechanics where the wave functions $\psi(x)$ are interpreted as probability
amplitudes. To say that $\psi$ is a probability amplitude is to say that the integral

$$\int_{c}^{d} \psi(x)^*\, \psi(x)\, dx$$

is the probability of detecting the particle in the interval (c, d).
In fact, an even stronger property of our solutions can be inferred:

$$\int_{-a}^{a} u_n(x)^*\, u_m(x)\, dx = \delta_{nm} = \begin{cases} 1, & n = m \\ 0, & n \ne m. \end{cases} \tag{4.16}$$

This equation expresses the orthonormality of the solutions, i.e. the solutions
are normalized and solutions with different index are orthogonal. This is a
general property of solutions to eigenvalue problems. The general theory will
be spelt out in chapter 5.
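The orthonormality of the square well solutions can be checked numerically. The sketch below assumes the normalization constants $A = B = 1/\sqrt{a}$ (which follow from the normalization condition, since the integral of $\cos^2$ or $\sin^2$ over the well equals $a$), sets the half-width to a = 1, and uses a simple midpoint rule; the function names are illustrative only.

```python
import math

A_HALF = 1.0  # half-width a of the well; an arbitrary choice for this sketch


def u(n, x, a=A_HALF):
    """Square well eigenfunction: cosine for odd n, sine for even n, with A = B = 1/sqrt(a)."""
    arg = n * math.pi * x / (2 * a)
    value = math.cos(arg) if n % 2 == 1 else math.sin(arg)
    return value / math.sqrt(a)


def overlap(n, m, a=A_HALF, steps=2000):
    """Midpoint-rule approximation of the integral of u_n(x) * u_m(x) over [-a, a]."""
    dx = 2 * a / steps
    total = 0.0
    for i in range(steps):
        x = -a + (i + 0.5) * dx
        total += u(n, x, a) * u(m, x, a)
    return total * dx


# Table of overlaps for n, m = 1..3: close to the identity matrix.
overlaps = [[overlap(n, m) for m in range(1, 4)] for n in range(1, 4)]
for row in overlaps:
    print(["%.6f" % abs(value) for value in row])
```

The eigenfunctions are real here, so the complex conjugation in the integral has no effect.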
The solutions $u_n(x)$ can now be considered to form a basis of a linear vector
space. Any solution to the wave equation can be expressed as a linear
combination (or superposition) of the basis functions

$$\psi(x) = \sum_{n=1}^{\infty} a_n u_n(x), \tag{4.17}$$

where $a_n$ are complex numbers. These numbers are arbitrary apart from a
global normalization. Since the total probability of detecting the particle in the
box must be 1, we get

$$\int_{-a}^{a} \psi(x)^*\, \psi(x)\, dx = \sum_{n=1}^{\infty} a_n^* a_n = 1. \tag{4.18}$$

This follows from a nice calculation involving several of the formulas given
in this section and some trigonometry. Performing this calculation gives quite
a lot of insight into the mathematics of quantum mechanics.
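For the reader who wants a starting point, the calculation can be sketched as follows, assuming only the expansion of $\psi$ in the basis functions and the orthonormality relation (4.16); the trigonometry is hidden in the proof of (4.16) itself.

$$\begin{aligned}
\int_{-a}^{a} \psi(x)^*\,\psi(x)\,dx
&= \int_{-a}^{a} \Big(\sum_n a_n u_n(x)\Big)^{*} \Big(\sum_m a_m u_m(x)\Big)\,dx \\
&= \sum_{n,m} a_n^*\, a_m \int_{-a}^{a} u_n(x)^*\, u_m(x)\,dx \\
&= \sum_{n,m} a_n^*\, a_m\, \delta_{nm}
 = \sum_n a_n^*\, a_n .
\end{aligned}$$

The interchange of sum and integral is taken for granted here; a rigorous treatment would have to justify it.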

Since we now know k, we can get a formula for the separation constant E,
which is to be interpreted as the energy of the system. With $k = n\pi/2a$,

$$E_n = \frac{\hbar^2 k^2}{2m} = \frac{n^2\pi^2\hbar^2}{8ma^2}.$$

In this way we get quantization of the energy. The energy can only take
values determined by the integer n. This is symbolized by indexing the E with
n.
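The quantization can also be seen numerically, without using the closed-form solution. The following sketch is a shooting method under stated assumptions: units with $\hbar = m = 1$ and a = 1, a velocity-Verlet integrator, and an upward scan in E with bisection. It integrates $u'' = -2Eu$ from the left wall and keeps only those E for which u also vanishes at the right wall. All names are illustrative.

```python
import math


def shoot(E, a=1.0, steps=2000):
    """Integrate u'' = -2*E*u from x = -a with u = 0, u' = 1 and return u(a)."""
    h = 2 * a / steps
    u, du = 0.0, 1.0
    for _ in range(steps):
        du_half = du - 0.5 * h * 2 * E * u      # half step for the derivative
        u += h * du_half                        # full step for u
        du = du_half - 0.5 * h * 2 * E * u      # second half step for the derivative
    return u


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f changes sign between lo and hi."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def well_energies(count, a=1.0, dE=0.05):
    """Scan upward in E and refine every sign change of u(a; E)."""
    found = []
    E = dE
    previous = shoot(E, a)
    while len(found) < count:
        following = shoot(E + dE, a)
        if (previous > 0) != (following > 0):
            found.append(bisect(lambda e: shoot(e, a), E, E + dE))
        E, previous = E + dE, following
    return found


numeric = well_energies(3)
exact = [n ** 2 * math.pi ** 2 / 8 for n in (1, 2, 3)]  # n^2 pi^2 hbar^2 / (8 m a^2)
for n, (e_num, e_exact) in enumerate(zip(numeric, exact), start=1):
    print(n, e_num, e_exact)
```

Only a discrete set of energies satisfies both boundary conditions, which is the numerical counterpart of indexing E with n.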

But why is $E_n$ energy? This can be understood by going back to the classical
equation for energy in terms of kinetic and potential energy, $E = K + V$. In our
case, the potential energy is zero inside the box, and we have simply $E = K$.
Then using the quantization rules we get

$$E \rightarrow H = \frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial x}\right)^2.$$

If this Hamiltonian operator is applied to any of the solutions, here a cosine
solution, the result is

$$H u_n(x) = \frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial x}\right)^2 B\cos\left(\frac{n\pi x}{2a}\right) = \frac{\hbar^2}{2m}\left(\frac{n\pi}{2a}\right)^2 B\cos\left(\frac{n\pi x}{2a}\right) = E_n u_n(x).$$

In the next to last expression, we recognize the separation constant
(eigenvalue) $E_n$. Thus, the physical interpretation of the eigenvalue equation

$$H u_n(x) = E_n u_n(x)$$

is that the eigenvalues corresponding to the Hamiltonian operator H are
the energies of the accessible states for the system. This makes sense, since
the Hamiltonian itself is the quantum operator corresponding to the classical
energy.

We have thus seen that the states of the system form an enumerable infinite
set. The first stage of abstraction in quantum mechanics is to note that,
although we might need special properties of these solutions, it is not necessary
to work with the explicit representation in terms of sine and cosine functions.
Our next example system, the harmonic oscillator, will illustrate this.

4.2 Linear harmonic oscillator 

The harmonic oscillator is a simple and extremely useful model of physical
systems both in classical physics and quantum physics. Classically, it can be used
to model mechanical vibrations. In quantum physics it is a model for the modes of
electromagnetic waves. Its usefulness stems from the fact that even complicated
many-particle systems or continuous media can often be analyzed in terms of
normal modes of vibration, perhaps after linearization, and furthermore, that
it is a completely solvable model.

The model is also very useful in that it has enough features in order to 
develop large portions of elementary classical and quantum mechanics within 
it. This is precisely what will be done in this section. 

The simple classical harmonic oscillator consists of a particle connected with
a spring to a rigid wall. The force from the spring is proportional to the
displacement of the spring from its natural length. If other forces like air
resistance and friction are neglected, the particle will oscillate forever once it
is set in motion. During this oscillation, there will be an oscillation of energy
between kinetic energy of motion and potential energy in the spring. The total
energy is constant during the motion; it is said to be a constant of the motion.

When setting up a model for this system, physicists normally abstract away
from the wall, instead considering a particle of mass m attracted to a fixed
center by a force that is proportional to the displacement from the center. This
gives a more symmetrical formulation, and the displacement can take negative
values. Letting x denote the displacement from the center, the force acting on
the particle is $F = -kx$. The negative sign makes the force attractive. The
equilibrium position is in the center where $x = 0$. The so called conservative
forces, i.e. forces that conserve total energy, can always be derived from a
potential by the relation

$$F = -\frac{dV}{dx}.$$

It is thus easy to see that the potential energy for a harmonic oscillator is

$$V(x) = \frac{1}{2}kx^2.$$

The total classical energy of the oscillator is

$$E = K + V = \frac{p^2}{2m} + \frac{1}{2}kx^2. \tag{4.20}$$

Since, as noted, E is a constant during the motion, we can read off the
oscillation of the energy between kinetic energy and potential energy. When
$x = 0$, corresponding to the particle passing through the center of motion, all
energy is kinetic and the momentum p takes its maximum value. On the other
hand, when $p = 0$, the kinetic energy is zero and all energy is potential. This
corresponds to the turning points at the maximum distance from the center.

We will not analyze the classical model further, but turn directly to the 
quantum harmonic oscillator. 



4.2.1 Quantization of the oscillator 

Referring back to the quantization rules of the previous section we can now
easily write down the Hamiltonian for the harmonic oscillator. Applying the
quantization rules to the energy (4.20) we get

$$H = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + \frac{1}{2}kx^2.$$

The Schrödinger equation becomes

$$i\hbar\frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2 \psi}{\partial x^2} + \frac{1}{2}kx^2\,\psi.$$

The separation of variables and the solution for the time-dependence, i.e. the
steps recorded in section 4.1.1, are exactly the same for the harmonic oscillator.
We need only concentrate on the space-dependence. In fact, this is true for all
systems for which the forces are time-independent. Thus, after separation of
the variables we always end up with equation (4.9) of the previous section, with
the appropriate potential (see the footnote on other coordinate systems).
Therefore, the eigenvalue equation to solve is

$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}u + \frac{1}{2}kx^2 u = E u. \tag{4.22}$$

At this stage, one can proceed as in the previous section, and solve this
differential equation to obtain an infinite set of basis functions. The set of
orthonormal basis functions can be written as

$$u_n(x) = N_n H_n(\alpha x)\exp\left(-\tfrac{1}{2}\alpha^2 x^2\right).$$

Here, $\alpha$ is a constant and $N_n$ is a normalization constant,

$$\alpha = \left(\frac{mk}{\hbar^2}\right)^{1/4}, \qquad N_n = \sqrt{\frac{\alpha}{\sqrt{\pi}\,2^n\,n!}},$$

and $H_n$ are Hermite polynomials, the first few of which are

$$H_0(x) = 1, \quad H_1(x) = 2x, \quad H_2(x) = 4x^2 - 2.$$

Footnote: In three dimensions of space, and in other coordinate systems than
rectangular, the equation is more complicated, but in principle it is always the
same equation with the relevant potential.
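The oscillator basis functions can be generated and tested numerically. The sketch below builds the Hermite polynomials from the standard recurrence $H_{j+1}(x) = 2xH_j(x) - 2jH_{j-1}(x)$ and assumes the normalization $N_n = \sqrt{\alpha/(\sqrt{\pi}\,2^n n!)}$ with $\alpha = 1$. The integration window and step count are arbitrary choices made for the example.

```python
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) from H_{j+1} = 2x H_j - 2j H_{j-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for j in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * j * h_prev
    return h


def u(n, x, alpha=1.0):
    """Oscillator basis function N_n H_n(alpha x) exp(-alpha^2 x^2 / 2)."""
    norm = math.sqrt(alpha / (math.sqrt(math.pi) * 2 ** n * math.factorial(n)))
    return norm * hermite(n, alpha * x) * math.exp(-0.5 * (alpha * x) ** 2)


def overlap(n, m, alpha=1.0, half_width=10.0, steps=4000):
    """Midpoint-rule integral of u_n * u_m over [-half_width, half_width]."""
    dx = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += u(n, x, alpha) * u(m, x, alpha)
    return total * dx


print(hermite(2, 1.5))              # compare with 4 * 1.5**2 - 2
print(overlap(0, 0), overlap(0, 2), overlap(2, 2))
```

The Gaussian factor makes the integrand negligible far from the origin, so a finite window is sufficient in practice.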

We will not derive this solution to the harmonic oscillator. Instead we
will introduce a more abstract and more powerful formalism, which is also
the standard formalism used in almost all applications of harmonic oscillators.
This is the method of annihilation and creation operators, and in the process of
introducing them we will also introduce the very useful notation of bra and ket
vectors invented by Dirac.



4.2.2 Operators for momentum and position 

One step in the quantization of a classical system is to replace classical
dynamical variables with operators. Momentum p and position x are replaced by the
momentum operator and the position operator respectively, often denoted by
$\hat{p}$ and $\hat{x}$. Explicit representations of these operators are

$$\hat{p} = -i\hbar\frac{\partial}{\partial x} \tag{4.23}$$
$$\hat{x} = x, \tag{4.24}$$

which is the same representation of the operators as in equations (4.3) and (4.4).
This particular representation is valid in the configuration space representation
of the states using x-space wave-functions. Note that momentum is represented
by a differential operator, and position by a multiplication. A consequence
of this is that the order of application of the operators matters. A simple
calculation illustrates this. Consider applying first $\hat{x}$ and then $\hat{p}$ to a
state $\psi$, and then applying these operators in the reverse order, i.e. first
$\hat{p}$ and then $\hat{x}$:

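$$\hat{p}\hat{x}\psi = -i\hbar\frac{\partial}{\partial x}(x\psi) = -i\hbar\psi - i\hbar x\frac{\partial}{\partial x}\psi = -i\hbar\left(1 + x\frac{\partial}{\partial x}\right)\psi,$$

$$\hat{x}\hat{p}\psi = -i\hbar x\frac{\partial}{\partial x}\psi.$$

Then subtract these expressions to get

$$(\hat{x}\hat{p} - \hat{p}\hat{x})\psi = i\hbar\psi.$$

The state $\psi$ is completely arbitrary here, and can be removed, yielding the
operator equation

$$\hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar. \tag{4.25}$$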
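The operator equation can be tested on a concrete state. The sketch below is an illustration under assumptions: $\hbar = 1$, a Gaussian test function, and central differences for the derivative. It applies the two operators in both orders and compares the difference with $i\hbar\psi$.

```python
import math

HBAR = 1.0  # natural units, an assumption of this sketch


def derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def p_op(f):
    """Momentum operator in the x-representation: (p f)(x) = -i*hbar*f'(x)."""
    return lambda x: -1j * HBAR * derivative(f, x)


def x_op(f):
    """Position operator: (x f)(x) = x * f(x)."""
    return lambda x: x * f(x)


def psi(x):
    """Arbitrary smooth test state (a Gaussian); any differentiable function would do."""
    return math.exp(-x * x)


def commutator_on(f, x):
    """Value of ((x p - p x) f)(x)."""
    return x_op(p_op(f))(x) - p_op(x_op(f))(x)


x0 = 0.3
lhs = commutator_on(psi, x0)
rhs = 1j * HBAR * psi(x0)
print(lhs, rhs)
```

The two printed numbers agree up to the discretization error of the derivative, independently of the point `x0` chosen.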

Footnote: Any standard textbook on quantum mechanics contains the calculations [27, ?].
Footnote: $\hat{p}$ and $\hat{x}$ are pronounced p-hat and x-hat.

This combination of operators is so important in quantum mechanics and
so frequently occurring that a special notation is introduced. A commutator
bracket, or simply a commutator, between two operators A and B is defined by

$$[A, B] = AB - BA. \tag{4.26}$$

Using this notation, equation (4.25) can be written

$$[\hat{x}, \hat{p}] = i\hbar. \tag{4.27}$$

This is a fundamental equation relating the operators $\hat{x}$ and $\hat{p}$ and it holds
whatever representation is used. Therefore it can be used as a quantization
condition.

For completeness, we also record the trivial commutators

$$[\hat{x}, \hat{x}] = 0 \tag{4.28}$$
$$[\hat{p}, \hat{p}] = 0. \tag{4.29}$$

In general, an operator always commutes with itself.

4.2.3 Commutators

If the commutator between two operators is non-zero, i.e. if $[A, B] \ne 0$, the
operators are said to be non-commuting. In that case, as we have seen, the
order in which they are applied to a quantum state matters. Sometimes, when
there is no particular ordering to prefer, or when the ordering is ambiguous, a
symmetrical ordering is chosen as a kind of default ordering

$$(AB)_{\mathrm{sym}} = \frac{1}{2}(AB + BA). \tag{4.30}$$



4.2.4 A note on classical dynamics 

Classical dynamics for a particle in one dimension of space is governed by
Newton's equation

$$ma = F, \tag{4.31}$$

where a denotes the acceleration. Acceleration is the time derivative of
velocity,

$$a = \frac{dv}{dt},$$

which in its turn is related to the particle momentum through the equation

$$v = \frac{p}{m}. \tag{4.32}$$

The force F is determined by the potential V

$$F = -\frac{dV}{dx}.$$

Combining these four equations while assuming that the mass is constant
(the normal case), yields

$$\frac{dp}{dt} = -\frac{dV}{dx}.$$

Since the kinetic energy K does not depend on x, it is possible to replace
the potential energy V in this formula with the total energy E. It is practical to
change notation slightly and use H for the total energy also in the classical case,
reserving E for the quantum mechanical energy eigenvalues. Thus we write for
the total energy $H = K + V$. This yields the dynamical equation

$$\frac{dp}{dt} = -\frac{\partial H}{\partial x}. \tag{4.34}$$

At this stage one might worry that the original Newton equation (4.31) is
a second order differential equation (remember, the acceleration a is a second
order derivative with respect to time), and (4.34) is a first order differential
equation. Something is missing.

The missing ingredient is precisely the equation (4.32) relating velocity and
momentum, or rather this equation rewritten so that it relates velocity to the
momentum derivative of the energy. Differentiating H with respect to p yields

$$\frac{\partial H}{\partial p} = \frac{\partial K}{\partial p} = \frac{\partial}{\partial p}\frac{p^2}{2m} = \frac{p}{m}.$$

But this is precisely the velocity $v = dx/dt$, so we get

$$\frac{dx}{dt} = \frac{\partial H}{\partial p}. \tag{4.35}$$
These two equations, (4.34) and (4.35), are the fundamental dynamical
equations of classical mechanics. This reformulation of Newtonian mechanics was
performed during the 18th and 19th centuries by Euler, Lagrange, Hamilton and
Poisson. It is quite general and it is the formulation of classical mechanics in
which the translation to quantum mechanics is most easily performed. In the
general theory, where there might be more than one particle, one considers a
dynamical system described by a set of dynamical variables $\{(x_i, p_i)\}$. The
energy, or Hamiltonian, is a function of these variables. Then the Hamiltonian
equations of motion become

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \tag{4.36}$$
$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i}. \tag{4.37}$$


Footnote: These variables can be more general than ordinary position and
momentum, including for example angular variables.

Note how beautiful and symmetrical these equations are! However, they
can still be rewritten in an even more compact way in which the transition to
quantum mechanics is most easily performed. In order to do that, consider a
dynamical system with coordinates and momenta $\{(x_i, p_i)\}$ where the index i
ranges from 1 to n, n being the number of particles of the system. Let A and
B be two dynamical variables, both differentiable functions of the x's and p's.
The Poisson bracket is defined by

$$\{A, B\} = \sum_{i=1}^{n}\left(\frac{\partial A}{\partial x_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial x_i}\right). \tag{4.38}$$

As an example, calculate the brackets $\{x, H\}$ and $\{p, H\}$ for a one-dimensional
system, i.e. n = 1.

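$$\{x, H\} = \frac{\partial x}{\partial x}\frac{\partial H}{\partial p} - \frac{\partial x}{\partial p}\frac{\partial H}{\partial x} = \frac{\partial H}{\partial p}$$

$$\{p, H\} = \frac{\partial p}{\partial x}\frac{\partial H}{\partial p} - \frac{\partial p}{\partial p}\frac{\partial H}{\partial x} = -\frac{\partial H}{\partial x}$$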

Combining these equations with (4.36), (4.37) and generalizing to n particles,
the Hamiltonian equations of motion can be written compactly

$$\frac{dx_i}{dt} = \{x_i, H\} \tag{4.39}$$
$$\frac{dp_i}{dt} = \{p_i, H\}. \tag{4.40}$$

In these equations, coordinates (positions) and momenta appear completely
symmetrically.
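The Poisson bracket for one degree of freedom is easy to evaluate numerically. The following sketch is an illustration, not part of the general theory: it approximates the partial derivatives by central differences and uses a harmonic oscillator energy with arbitrarily chosen values of m and k as the example Hamiltonian.

```python
def partial(f, args, index, h=1e-6):
    """Central-difference partial derivative of f with respect to args[index]."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def poisson(A, B, x, p):
    """{A, B} = dA/dx * dB/dp - dA/dp * dB/dx for a single pair (x, p)."""
    point = (x, p)
    return (partial(A, point, 0) * partial(B, point, 1)
            - partial(A, point, 1) * partial(B, point, 0))


M, K = 2.0, 3.0  # hypothetical mass and spring constant


def H(x, p):
    """Example Hamiltonian: harmonic oscillator energy p^2/2m + k x^2/2."""
    return p * p / (2 * M) + 0.5 * K * x * x


def X(x, p):
    return x


def P(x, p):
    return p


x0, p0 = 0.4, -1.2
print(poisson(X, H, x0, p0), p0 / M)     # {x, H} compared with dH/dp = p/m
print(poisson(P, H, x0, p0), -K * x0)    # {p, H} compared with -dH/dx = -k x
print(poisson(X, P, x0, p0))             # {x, p}, close to 1
```

The last line shows the classical counterpart of the basic commutator: the bracket of position and momentum equals 1 at every point of phase space.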

The reader might feel that an edifice (a general theory of dynamics) has been
erected on a very tiny base (the harmonic oscillator). That is indeed true. But
everything in this section can be developed quite rigorously from fundamental
principles of physics. A good accessible reference is [?]. As a last point, it
can be shown that the equation of motion for a function F of the dynamical
variables can be written

$$\frac{dF}{dt} = \{F, H\} + \frac{\partial F}{\partial t}. \tag{4.41}$$

The second term, $\partial F/\partial t$, vanishes if there is no explicit time dependence for
F. This is often the case.

4.2.5 Quantization 

Hamilton's equations for classical dynamics offer a very natural starting point
for quantization. The rules are

1. Replace classical dynamical variables A with the corresponding quantum
operators $\hat{A}$

$$A \rightarrow \hat{A}$$

2. Replace the Poisson brackets $\{\cdot, \cdot\}$ with commutators $[\cdot, \cdot]$

$$\{A, B\} \rightarrow \frac{1}{i\hbar}[\hat{A}, \hat{B}].$$

Upon these replacements, the equations of motion for the operators become

$$\frac{d\hat{F}}{dt} = \frac{1}{i\hbar}[\hat{F}, \hat{H}]. \tag{4.42}$$

This form of quantum dynamics bears no obvious relation to the Schrödinger
equation. The states do not even appear here. But in fact, there is a close
relation and it will be explained in chapter 5. Suffice it here to say that in the
present form of dynamics, which goes under the name Heisenberg picture, the
time-development of the system is carried by the operators while the states are
time-independent. That is the reason why they don't appear explicitly. In
the Schrödinger equation approach, called the Schrödinger picture, the states
carry the time-development while the operators are time-independent.

4.2.6 Dirac notation, a case of abstraction 

The Dirac notation is a very useful way of formulating quantum mechanics 
in that it allows quantum mechanical systems to be treated in a uniform way 
by abstracting away from particularities. In fact, a computer scientist might 
want to regard the Dirac formalism as providing an interface, specifying what 
properties and methods a system should support without entering into details 
on how to implement them. 

Suppose a certain system is described by a wave function $\psi(x)$ expanded as
in (4.17)

$$\psi(x) = \sum_{n=0}^{\infty} a_n u_n(x).$$

Now, the basis functions $u_n$ belong to a certain class of functions, all sharing
common properties. Often it is just those properties that are important, not the
explicit x-space representation. Letting n label these properties, we can write
the expansion in the form

$$|\psi\rangle = \sum_{n=0}^{\infty} a_n |n\rangle,$$

where the ket-notation $|\cdot\rangle$ was introduced by Dirac to denote abstract
quantum states.

Instead of having physical quantities represented as explicit differential
operators acting on the wave functions, those quantities are now represented as
abstract operators acting on the label n.

4.2.7 Summary of the classical harmonic oscillator

Returning now to the harmonic oscillator, we can write the dynamical equations
compactly using the general formalism just developed. The Hamiltonian is

$$H = \frac{p^2}{2m} + \frac{1}{2}kx^2.$$

With this particular form for the Hamiltonian, equations (4.39) and (4.40) yield

$$\frac{dx}{dt} = \{x, H\} = \frac{\partial H}{\partial p} = \frac{p}{m}, \qquad \frac{dp}{dt} = \{p, H\} = -\frac{\partial H}{\partial x} = -kx.$$

Together these equations give

$$m\frac{d^2x}{dt^2} = -kx,$$

which is Newton's equation for a harmonic oscillator. To see this, note that
the first equation gives $p = mv$ and insert this formula for p in the second
equation.
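The two first-order equations are also a convenient starting point for numerical integration. The sketch below assumes m = k = 1 and uses a leapfrog scheme chosen for this illustration. It follows the oscillator for one classical period $2\pi\sqrt{m/k}$ and checks that the state returns to its starting point with the energy unchanged.

```python
import math


def evolve(x, p, m=1.0, k=1.0, dt=1e-3, steps=1000):
    """Leapfrog integration of dx/dt = p/m and dp/dt = -k*x."""
    for _ in range(steps):
        p -= 0.5 * dt * k * x      # half step in momentum
        x += dt * p / m            # full step in position
        p -= 0.5 * dt * k * x      # second half step in momentum
    return x, p


def energy(x, p, m=1.0, k=1.0):
    """Total energy H = p^2/2m + k x^2/2."""
    return p * p / (2 * m) + 0.5 * k * x * x


x0, p0 = 1.0, 0.0                      # start at a turning point
period = 2 * math.pi * math.sqrt(1.0 / 1.0)
n_steps = 6283
x1, p1 = evolve(x0, p0, dt=period / n_steps, steps=n_steps)

print(x1, p1)                           # close to the initial (1, 0)
print(energy(x0, p0), energy(x1, p1))   # energy is conserved to good accuracy
```

Starting at a turning point means that all the energy is initially potential; during the period it is converted to kinetic energy and back twice.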

4.2.8 Creation and annihilation operators 

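The groundwork is now laid for treating the harmonic oscillator using creation
and annihilation operators. First, the classical Hamiltonian is rewritten in a
suggestive way

$$H = \frac{p^2}{2m} + \frac{1}{2}kx^2 = \frac{1}{2}\sqrt{\frac{k}{m}}\left(\frac{p^2}{\sqrt{km}} + \sqrt{km}\,x^2\right).$$

Introducing $\omega = \sqrt{k/m}$, the Hamiltonian H becomes

$$H = \frac{1}{2}\omega\left(\frac{p^2}{m\omega} + m\omega x^2\right). \tag{4.43}$$

In anticipation of quantum mechanics, introduce $\hbar$ and do a further rewriting

$$\begin{aligned}
H &= \frac{1}{2}\hbar\omega\left(\frac{p^2}{m\omega\hbar} + \frac{m\omega}{\hbar}x^2\right) \\
  &= \frac{1}{2}\hbar\omega\left(\frac{p}{\sqrt{m\omega\hbar}} - i\sqrt{\frac{m\omega}{\hbar}}\,x\right)\left(\frac{p}{\sqrt{m\omega\hbar}} + i\sqrt{\frac{m\omega}{\hbar}}\,x\right) \\
  &= \frac{1}{2}\hbar\omega\, z\bar{z},
\end{aligned}$$

or for short

$$H = \frac{1}{2}\hbar\omega\, z\bar{z}. \tag{4.44}$$

Footnote: $\omega$ is related to the classical frequency of oscillations $f$ through
$\omega = 2\pi f$. A useful formula is $m\omega = \sqrt{km}$.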

The intuition behind this rewriting is that H is a quadratic form, and
therefore it should be possible to write it as a square. However, since x and p will
become operators, care must be exercised regarding the order in which they
appear in the product $z\bar{z}$. From now on, x and p are treated as operators, but
we will drop the 'hat' notation, writing $A$ for $\hat{A}$.

In order to capitalize on the possibility to write H as a square, define
annihilation and creation operators $a$ and $a^\dagger$

$$a = \frac{1}{\sqrt{2\hbar m\omega}}(p - im\omega x) \tag{4.45}$$
$$a^\dagger = \frac{1}{\sqrt{2\hbar m\omega}}(p + im\omega x) \tag{4.46}$$

The reason for giving them these somewhat esoteric names will become
clear subsequently. Comparing these definitions with (4.44) suggests taking
$z = \sqrt{2}\,a$ and $\bar{z} = \sqrt{2}\,a^\dagger$. However, since there is no reason to choose a
particular ordering of the operators, a symmetric ordering will be used. Thus the
Hamiltonian is written

$$H = \frac{1}{2}\hbar\omega\,(a a^\dagger + a^\dagger a).$$
Inserting (4.45) and (4.46), and performing some careful algebra, yields

$$\begin{aligned}
H &= \frac{1}{2}\hbar\omega\,\frac{1}{2\hbar m\omega}\Big((p - im\omega x)(p + im\omega x) + (p + im\omega x)(p - im\omega x)\Big) \\
  &= \frac{1}{4m}\big(p^2 + im\omega px - im\omega xp + m^2\omega^2x^2 + p^2 - im\omega px + im\omega xp + m^2\omega^2x^2\big) \\
  &= \frac{1}{2m}\big(p^2 + m^2\omega^2x^2\big),
\end{aligned}$$

which is the same formula as (4.43) slightly rearranged.

Note that the operator combinations xp and px cancel in the above
calculation. They would not have done that, had not a symmetrical ordering been
chosen.

So far, not very much has been achieved. In order to proceed, some
properties of the creation and annihilation operators must be derived. In quantum
mechanics, the commutators between operators are always important because
many of the properties of a system are encoded into the commutators. We
therefore calculate the commutator $[a, a^\dagger]$ using the definitions (4.45) and
(4.46) and the basic commutators (4.27) - (4.29)

$$[a, a^\dagger] = \frac{1}{2\hbar m\omega}[p - im\omega x,\; p + im\omega x] = \frac{1}{2\hbar m\omega}\big(im\omega[p, x] - im\omega[x, p]\big) = \frac{-2im\omega}{2\hbar m\omega}\,i\hbar = 1.$$

This implies that $a a^\dagger = a^\dagger a + 1$ and the Hamiltonian can be written as

$$H = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right). \tag{4.48}$$

The commutation relations for the creation and annihilation operators can
now be summarized

$$[a, a^\dagger] = 1 \tag{4.49}$$
$$[a, a] = [a^\dagger, a^\dagger] = 0. \tag{4.50}$$

So far no reference has been made to the states of the harmonic oscillator. It
is time to introduce them now. Referring back to equation (4.43) we see that H
is a positive definite operator (the energy is positive classically), and therefore,
on physical grounds, there must be a state with lowest energy. Denote this
ground state with $|0\rangle$. In computing the energy for this state, we must know the
effect of the creation and annihilation operators acting on it. We will choose

$$a|0\rangle = 0. \tag{4.51}$$

The intuition behind this choice is that the ground state, being the lowest
energy state, must be annihilated by the annihilation operator, but ultimately
it is justified by the results that follow. The energy of the ground state can now
be computed

$$H|0\rangle = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right)|0\rangle = \tfrac{1}{2}\hbar\omega\,|0\rangle.$$

If there is a ground state, there ought to be excited states, i.e. states with
higher energy. As the terminology suggests, the next excited state above the
ground state is created by the creation operator acting on the ground state.
Denoting this state with $|1\rangle$, let us define tentatively

$$|1\rangle = a^\dagger|0\rangle.$$

Now, there is a consistency requirement on these equations. Since $[a, a^\dagger] = 1$
it must be the case that

$$[a, a^\dagger]|0\rangle = 1|0\rangle = |0\rangle.$$

But this can now be checked explicitly

$$|0\rangle = [a, a^\dagger]|0\rangle = a a^\dagger|0\rangle - a^\dagger a|0\rangle = a|1\rangle.$$

Thus it must be the case that

$$a|1\rangle = |0\rangle,$$

the interpretation of which is that the first excited state is destroyed, or
annihilated, by a.

Clearly, it must be possible to generalize this and construct a hierarchy of
excited states by letting the creation operator act on the ground state repeatedly.
The excited state $|n\rangle$ ought to be the ground state acted on by n creation
operators, or writing a recursive definition

$$a^\dagger|n\rangle = \xi(n)|n + 1\rangle, \quad \text{for } n \ge 0, \tag{4.52}$$

where $\xi(n)$ is an as yet undetermined normalization. This equation agrees with
the previous formula for $|1\rangle$ when n = 0 if we demand $\xi(0) = 1$.

On the other hand, acting with the annihilation operator on the state $|n\rangle$
should yield the state $|n - 1\rangle$

$$a|n\rangle = \eta(n)|n - 1\rangle. \tag{4.53}$$

A careful analysis yields the coefficients $\xi(n) = \sqrt{n + 1}$ and $\eta(n) = \sqrt{n}$, so
that we have

$$a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle \tag{4.54}$$
$$a|n\rangle = \sqrt{n}\,|n - 1\rangle. \tag{4.55}$$

Equation (4.54) records the action of $a^\dagger$ as a raising operator: its action on
a state is to increase the quantum number n by 1. Likewise, equation (4.55)
records the action of $a$ as a lowering operator: its action on a state is to decrease
the quantum number n by 1.

The name creation operator originates in the quantum theory of the
electromagnetic field. The frequency modes of an electromagnetic field can be
described by harmonic oscillators. In a quantum description of the
electromagnetic field, a mode with frequency $f$ corresponds to a harmonic oscillator
with $\omega = 2\pi f$. The intensity of the field corresponds to the number of photons
in the mode, and the number of photons is precisely the quantum number n
of the harmonic oscillator. As will be shown below, the energy in the mode is
$\hbar\omega(n + \frac{1}{2})$. In this context, the action of the creation operator is to create a
new photon in the frequency mode. Correspondingly, the action of the lowering
or annihilation operator a is to decrease the quantum number n or annihilate a
photon in the mode.

Next, it follows that $a^\dagger a$ is a number operator, counting the excitation level
of the state. Using the equations (4.54) and (4.55) we get

$$a^\dagger a|n\rangle = a^\dagger\sqrt{n}\,|n - 1\rangle = \sqrt{n}\,a^\dagger|n - 1\rangle = \sqrt{n}\sqrt{n}\,|n\rangle = n|n\rangle.$$

In these calculations we are using the fact that numbers commute with
operators, i.e. the order can be interchanged freely.

Sometimes the number operator is denoted by N, writing $N = a^\dagger a$. Not
surprisingly, the states $|n\rangle$ are eigenstates of the number operator, or

$$N|n\rangle = n|n\rangle. \tag{4.56}$$

The states $|n\rangle$ are furthermore eigenstates of the Hamiltonian since it can
be written in terms of the number operator as $H = \hbar\omega(N + \frac{1}{2})$,

$$H|n\rangle = \hbar\omega\left(N + \tfrac{1}{2}\right)|n\rangle = \hbar\omega\left(n + \tfrac{1}{2}\right)|n\rangle. \tag{4.57}$$

From this equation we can read off the energy spectrum of the linear harmonic
oscillator,

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right). \tag{4.58}$$

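We have the following set of formulas, summarizing this algebraic treatment
of the harmonic oscillator

$$\begin{cases}
[a, a^\dagger] = 1 \\
a|0\rangle = 0 \\
a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle \\
a|n\rangle = \sqrt{n}\,|n - 1\rangle \\
a^\dagger a|n\rangle = n|n\rangle
\end{cases}$$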
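These formulas can be checked in a finite matrix model. In the sketch below the state space is truncated to the first `dim` levels, which is an approximation introduced only for this illustration, and the operators are written as plain lists of lists. The truncation shows up in one place: the last diagonal entry of the commutator is not 1, because the state $|dim\rangle$ has been cut away.

```python
import math


def ladder_matrices(dim):
    """Matrices of a and a-dagger in the basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)          # <n-1| a |n> = sqrt(n)
    a_dag = [[a[col][row] for col in range(dim)] for row in range(dim)]  # transpose; entries are real
    return a, a_dag


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def subtract(A, B):
    """Entrywise difference A - B."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


DIM = 6
a, a_dag = ladder_matrices(DIM)
number = matmul(a_dag, a)                               # a-dagger a
commutator = subtract(matmul(a, a_dag), number)         # [a, a-dagger]

print([round(number[i][i], 12) for i in range(DIM)])       # excitation levels n
print([round(commutator[i][i], 12) for i in range(DIM)])   # 1 except in the last slot
```

With $\hbar\omega = 1$, the diagonal of `number` shifted by one half reproduces the energy spectrum (4.58) for the retained levels.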

Admittedly there is quite a lot of handwaving going into this 'derivation' of
the harmonic oscillator spectrum, but this has more to do with the pedestrian
approach of this section than with the method as such. This algebraic approach
to solving the harmonic oscillator can be made more rigorous. But it is clearly
more abstract than the wave-function approach. The reader might wonder what
is the actual content of an abstract equation such as $a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle$.

One computer science oriented way of thinking of equations like these is
to regard them purely syntactically. Then, wherever we see the combination of
symbols $a^\dagger|n\rangle$ we are allowed to replace them by $\sqrt{n + 1}\,|n + 1\rangle$. Of course,
quite a few more rules are needed in order to 'calculate' syntactically with this
model, but in principle, the whole theory can be phrased entirely abstractly in
terms of formal syntactic rules. Semantics can be added to the model by
interpreting the operators and the states. One such interpretation is in terms of
configuration space, derivatives and wave functions. Another one is in terms of
(infinite dimensional) matrices and vectors. This can also be regarded as making a
distinction between interface and implementation. Seen in this way, the equation
$a^\dagger|n\rangle = \sqrt{n + 1}\,|n + 1\rangle$ belongs in the interface, being effectively a
specification of a functionality to be provided by the implementation. The
implementation then, could be in terms of wave functions or in terms of matrices,
or perhaps in terms of some other (mathematical) constructs suitable for the
purpose. In the next section on angular momentum and spin, we will see a
concrete example of this.
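The interface and implementation distinction can be made concrete in code. The sketch below is one possible rendering, with all class and method names invented for this illustration: an abstract base class states what must be provided, one implementation works by rewriting the label n (the syntactic view), and another uses truncated coefficient vectors (the matrix view). A client function written against the interface gives the same answer for both.

```python
import math
from abc import ABC, abstractmethod


class Oscillator(ABC):
    """Interface: what any implementation of the ladder algebra must provide."""

    @abstractmethod
    def ket(self, n): ...

    @abstractmethod
    def create(self, state): ...

    @abstractmethod
    def annihilate(self, state): ...

    @abstractmethod
    def amplitudes(self, state): ...   # returns a dict {n: coefficient}


class SyntacticOscillator(Oscillator):
    """States are dicts {n: coefficient}; operators are rewrite rules on the label n."""

    def ket(self, n):
        return {n: 1.0}

    def create(self, state):
        return {n + 1: c * math.sqrt(n + 1) for n, c in state.items()}

    def annihilate(self, state):
        return {n - 1: c * math.sqrt(n) for n, c in state.items() if n > 0}

    def amplitudes(self, state):
        return dict(state)


class MatrixOscillator(Oscillator):
    """States are coefficient lists truncated at dim levels (an approximation)."""

    def __init__(self, dim):
        self.dim = dim

    def ket(self, n):
        return [1.0 if i == n else 0.0 for i in range(self.dim)]

    def create(self, state):
        return [0.0] + [math.sqrt(n + 1) * state[n] for n in range(self.dim - 1)]

    def annihilate(self, state):
        return [math.sqrt(n + 1) * state[n + 1] for n in range(self.dim - 1)] + [0.0]

    def amplitudes(self, state):
        return {n: c for n, c in enumerate(state) if c != 0.0}


def number_of(osc, n):
    """Apply the number operator (annihilate, then create) to |n>, using only the interface."""
    return osc.amplitudes(osc.create(osc.annihilate(osc.ket(n))))


syntactic = SyntacticOscillator()
matrix = MatrixOscillator(6)
for level in range(5):
    print(level, number_of(syntactic, level), number_of(matrix, level))
```

The client `number_of` never inspects how a state is stored, which is exactly the sense in which the algebraic equations act as a specification.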
Derivation of the normalization conditions

This section is somewhat technical, and does involve certain concepts not yet
discussed. The purpose is to derive the normalization coefficients $\xi(n)$ and $\eta(n)$.
The reader might want to skip it for now and return after reading chapter 5.

The coefficients $\xi$ and $\eta$ are subject to some consistency conditions. First,
since $[a, a^\dagger] = 1$, we have

$$[a, a^\dagger]|n\rangle = |n\rangle \;\Rightarrow\; \xi(n)\eta(n + 1) - \xi(n - 1)\eta(n) = 1. \tag{4.59}$$

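Furthermore, the states $|n\rangle$ are subject to an orthonormality condition,
analogous to (??)

$$\langle n|m\rangle = \delta_{nm},$$

and in particular

$$\langle n|n\rangle = 1.$$

The question arises, what is $\langle n|$? A detailed explanation of this will be given
in chapter 5. Here we can think of $\langle n|$ as a form of conjugate to $|n\rangle$. As regards
the equation (4.52), this conjugation, denoted by a dagger $\dagger$, works as follows

$$\left(a^\dagger|n\rangle\right)^\dagger = \left(\xi(n)|n + 1\rangle\right)^\dagger \;\Rightarrow\; \langle n|a = \langle n + 1|\,\xi(n)^*, \tag{4.60}$$

and as regards equation (4.53)

$$\left(a|n\rangle\right)^\dagger = \left(\eta(n)|n - 1\rangle\right)^\dagger \;\Rightarrow\; \langle n|a^\dagger = \langle n - 1|\,\eta(n)^*. \tag{4.61}$$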

Enforcing the condition (n|n) = 1 on the state that is {n+l\n+l) = 1 

and using H4.52|l and H4.53|l as well as H4.60|l and (|4.61|) yields 

{n + l\n + l) = I {n\aa^n) = 
^^-i^(n|[a,at] -fatain) = + r;(n),y(n)*|n) = 

where the common rewriting trick aa^ — [a, a^] + a^'a has been used. 
Concluding, we get 

— i-— (l + ry(n)r/(n)*) = 1, 
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or more succinctly 

^HCW* - r]{n)r]{n)* ^ 1 (4.62) 

Before trying to solve equations (4.59) and (4.62) we will make the simplifying
assumption that $\xi$ and $\eta$ are real. This assumption can be justified after
the fact. So equation (4.62) becomes

$$\xi(n)^2 - \eta(n)^2 = 1. \tag{4.63}$$

The solution to (4.59) and (4.63) can now be constructed in an inductive
way. Starting from $a|0\rangle = 0$ which implies

$$\eta(0) = 0,$$

we find that (4.63) implies

$$\xi(0) = 1,$$

which in its turn using (4.59) implies

$$\eta(1) = 1.$$

Going on in this way, using equations (4.59) and (4.63), we get the sequence
of equations

$$\xi(1) = \sqrt{2}, \quad \eta(2) = \sqrt{2}, \quad \xi(2) = \sqrt{3}, \quad \eta(3) = \sqrt{3},$$

or in general

$$\xi(n) = \sqrt{n+1}, \qquad \eta(n) = \sqrt{n}.$$

It is easy to prove these equations using mathematical induction on equations
(4.59) and (4.63).
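The inductive construction can also be followed numerically. The sketch below is only an illustration, not part of the derivation: the function name `oscillator_coefficients` is made up for the purpose, and it assumes real, positive coefficients, as in the simplifying assumption above. It alternates between (4.63) and (4.59), starting from $\eta(0) = 0$.

```python
import math

def oscillator_coefficients(n_max):
    """Return lists xi[0..n_max], eta[0..n_max] built from (4.59) and (4.63)."""
    xi = []
    eta = [0.0]  # base case: a|0> = 0 gives eta(0) = 0
    for n in range(n_max + 1):
        # (4.63): xi(n)^2 - eta(n)^2 = 1, taking the positive root
        xi.append(math.sqrt(1.0 + eta[n] ** 2))
        # (4.59): xi(n) eta(n+1) - xi(n-1) eta(n) = 1; the second term
        # vanishes for n = 0 because eta(0) = 0
        previous = xi[n - 1] * eta[n] if n > 0 else 0.0
        eta.append((1.0 + previous) / xi[n])
    return xi, eta[: n_max + 1]

xi, eta = oscillator_coefficients(5)
for n in range(6):
    # compare with the closed forms sqrt(n + 1) and sqrt(n)
    assert abs(xi[n] - math.sqrt(n + 1)) < 1e-9
    assert abs(eta[n] - math.sqrt(n)) < 1e-9
```

Each pass of the loop consumes one value of $\eta$ and produces the next, which is the successor-like structure of the inductive argument.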

A computer scientist cannot fail to register how closely this construction 
of the state space of the harmonic oscillator runs to the construction of the 
natural numbers as an inductive set. There are however important differences. 
This issue will be explored elsewhere.

4.3 Angular momentum and spin



In classical physics, angular momentum is a quantity related to rotation. So
for example, a particle of mass $m$, rotating with the velocity $v$ in a circle at a
distance $r$ from a center, has an angular momentum $L$ given by

$$L = mvr.$$

This, however, is an oversimplification. Since the circle of rotation lies in a
certain plane, and the velocity $\mathbf{v}$ is a vector (the direction of which is changing
as the particle moves in the circle), it turns out that angular momentum must
be described by a vector quantity $\mathbf{L}$. In the simple case of a particle rotating
with constant speed $v$ in a circle, the direction of the vector $\mathbf{L}$ coincides with
a vector normal to the plane of the circle.

The angular momentum, being a vector, can be written in terms of its
components $(L_x, L_y, L_z)$ in a rectangular coordinate system. Classically, all of the
components of $\mathbf{L}$ can be determined to arbitrary precision.

When angular momentum for atomic quantum systems such as the hydrogen 
atom was studied, it turned out that the situation was radically different. Since 
atomic systems contain rotating components such as electrons, angular momen- 
tum played a central role in the development of quantum mechanics. Without 
going into either the history of the subject,* or the detailed theory, let us just
record the basic facts. 

In quantum mechanics the components of the angular momentum become
Hermitian operators $(\hat{L}_x, \hat{L}_y, \hat{L}_z)$. Having said that, we will at once drop the
'hats' over the operators. The order of application of these operators on a
quantum state matters. This is recorded in the commutation relations

$$[L_x, L_y] = i\hbar L_z, \quad [L_y, L_z] = i\hbar L_x, \quad [L_z, L_x] = i\hbar L_y. \tag{4.64}$$

The physical consequence of this is that not all three components of $\mathbf{L}$ are
measurable simultaneously. Instead one normally considers the length square
of $\mathbf{L}$

$$\mathbf{L}^2 = L_x^2 + L_y^2 + L_z^2. \tag{4.65}$$

This quantity does commute with all components of $\mathbf{L}$, as can be shown
by simply carrying out the commutator algebra, using (4.64). Therefore, in
order to specify simultaneously measurable quantities for a rotating system,
one normally chooses $\mathbf{L}^2$ and one component of $\mathbf{L}$, the standard choice being
the z-component $L_z$.
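The commutator algebra can be spelled out for the z-component as a worked example. The only ingredient beyond (4.64) is the product rule $[AB, C] = A[B, C] + [A, C]B$, a standard identity that is assumed here rather than taken from the text.

```latex
\begin{align*}
[L_x^2, L_z] &= L_x [L_x, L_z] + [L_x, L_z] L_x = -i\hbar\,(L_x L_y + L_y L_x), \\
[L_y^2, L_z] &= L_y [L_y, L_z] + [L_y, L_z] L_y = +i\hbar\,(L_y L_x + L_x L_y), \\
[L_z^2, L_z] &= 0, \\
[\mathbf{L}^2, L_z] &= [L_x^2, L_z] + [L_y^2, L_z] + [L_z^2, L_z] = 0 .
\end{align*}
```

Here $[L_x, L_z] = -i\hbar L_y$ and $[L_y, L_z] = i\hbar L_x$ are read off from (4.64). The same steps with the indices permuted cyclically give $[\mathbf{L}^2, L_x] = [\mathbf{L}^2, L_y] = 0$.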

This is all very abstract, and in a concrete case, as for example when treating
the electron in a hydrogen atom, these operators are represented as differential
operators acting on the configuration space wave-function of the electron. Such a
concrete analysis shows that the angular momentum states can be characterized
by two quantum numbers $l$ and $m$ related to the eigenvalues of the operators
$\mathbf{L}^2$ and $L_z$. This involves quite a lot of long-winded calculations, and we will not
perform them here. They can be found in any textbook on quantum mechanics.
The calculations are similar to those that we reviewed in the case of the
particle in a potential box, essentially solving the Schrödinger equation in three
dimensions of space.

* A good reference treating the history of atomic, nuclear and elementary particle physics
is [?].

Instead we will work in a more abstract way. The constant $\hbar$, setting the
scale of quantum phenomena, plays no significant role in the following algebraic
treatment of angular momentum, so we will rescale it to 1, or to put it differently,
we will choose units of measurement where $\hbar = 1$.*

The angular momentum eigenstates will be denoted by $|l, m\rangle$ where $m$ is the
eigenvalue corresponding to the z-component of $\mathbf{L}$, i.e.

$$L_z|l,m\rangle = m\,|l,m\rangle. \tag{4.66}$$

Since angular momentum is described by two commuting operators, it makes
sense to label the eigenstates with two labels. As said, $l$ is a quantum number
in some way related to the length of the angular momentum vector $\mathbf{L}$, but the
exact correspondence is left open at the moment. We will work with a fixed
value for $l$.

The states are orthonormal

$$\langle l,m|l,n\rangle = \delta_{mn}. \tag{4.67}$$

Some further properties of this representation will now be derived. It will
become apparent that this can be done in rather close analogy with the way in
which the harmonic oscillator was treated. In that case the operators $x$ and $p$
were replaced by the creation and annihilation operators. The creation operator
increases the quantum number $n$ (acting like a kind of successor) whereas the
annihilation operator decreases the quantum number (like a predecessor). The
first step will be to introduce new operators $L_+$ and $L_-$ with properties similar
to creation and annihilation operators, with the difference that now we will get
a finite spectrum.

$$L_+ = L_x + iL_y \tag{4.68}$$
$$L_- = L_x - iL_y. \tag{4.69}$$

Next, the commutation relations (4.64) are rewritten in terms of these
operators and $L_z$

$$[L_z, L_+] = L_+, \quad [L_z, L_-] = -L_-, \quad [L_+, L_-] = 2L_z. \tag{4.70}$$

* This is standard practice in quantum mechanics. It is actually possible to reinstate $\hbar$
in the formulas at a later stage via so-called dimensional analysis. Another way to look at
this is to rescale the operators, or absorb factors of $\sqrt{\hbar}$. This is actually what we did in the
preceding section when defining the creation and annihilation operators.

To understand the action of the operator $L_+$,* perform the following
calculation

$$L_z\bigl(L_+|l,m\rangle\bigr) = \bigl([L_z, L_+] + L_+L_z\bigr)|l,m\rangle = (L_+ + mL_+)\,|l,m\rangle = (m+1)\,L_+|l,m\rangle.$$

This is a calculation of the $L_z$-eigenvalue for the state $L_+|l,m\rangle$ and it
shows that as compared to the state $|l,m\rangle$, the state $L_+|l,m\rangle$ has
eigenvalue $m + 1$, i.e. the eigenvalue is one unit larger. In the same way, it can
be shown that the state $L_-|l,m\rangle$ has $L_z$-eigenvalue $m - 1$ as compared to the
state $|l,m\rangle$. For this reason, the operators $L_+$ and $L_-$ are called raising and
lowering operators respectively. It now remains to calculate the spectrum of
eigenstates for $\mathbf{L}^2$ and $L_z$.

* Note the use of the standard trick, $AB = [A, B] + BA$, to take advantage of the
commutation relations for a pair of operators $A$ and $B$.

The intuition is the following. We have a physical system with a certain
angular momentum, somehow parameterized by the quantum number $l$. The
quantum number $m$ corresponds to the z-component. Obviously, if the total
angular momentum has a finite value, the z-component must also be bounded.
Even classically, no component of $\mathbf{L}$ can be larger than the length of $\mathbf{L}$ itself. Then, thinking
of $L_+$ as an operator that increases the z-component of the angular momentum, it
makes sense to postulate the existence of a state with the highest possible value
for $m$, denoted by $|l,l\rangle$, that is annihilated by $L_+$, or

$$L_+|l,l\rangle = 0. \tag{4.71}$$

This equation plays a similar role to that of equation (4.51) for the harmonic
oscillator as providing a base case for building the spectrum of states. The state
$|l,l\rangle$ is called the highest weight state.

Acting on the state $|l,l\rangle$ with the lowering operator $L_-$ should yield the state
$|l,l-1\rangle$, having one unit lower $m$-value. It is perhaps tempting then to assume
that $L_-|l,l\rangle = |l,l-1\rangle$, but that is wrong. There are normalization factors to
take into account. Instead, a detailed analysis will show

$$L_-|l,m\rangle = C(m)\,|l,m-1\rangle \tag{4.72}$$

as well as

$$L_+|l,m\rangle = C(m+1)\,|l,m+1\rangle \tag{4.73}$$

with normalization factors $C(m)$. These factors have the following form

$$C(m) = \sqrt{l(l+1) - m(m-1)}. \tag{4.74}$$

Applying $L_-$ repeatedly, we must at some stage arrive at a state with the
lowest possible value for the $L_z$-eigenvalue. A careful analysis shows that this
lowest weight is $-l$. So for a given $l$ the spectrum of the operator $L_z$ is
$\{-l, -l+1, \ldots, l-1, l\}$, yielding in total $2l + 1$ states. Now, the eigenvalues of
the operator $\mathbf{L}^2$ can also be calculated. First rewrite $\mathbf{L}^2$ as

$$\mathbf{L}^2 = \tfrac{1}{2}\,(L_+L_- + L_-L_+) + L_z^2 \tag{4.75}$$

using the definitions (4.68) and (4.69). Calculating the eigenvalues of $\mathbf{L}^2$ is
now a simple matter of applying $\mathbf{L}^2$ to a state $|l,m\rangle$ and using the appropriate
equations. The result is

$$\mathbf{L}^2|l,m\rangle = l(l+1)\,|l,m\rangle. \tag{4.76}$$

Thus the eigenvalue of $\mathbf{L}^2$ in a state $|l,m\rangle$ is $l(l+1)$.
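The step from (4.75) to (4.76) can be checked with exact rational arithmetic. The following sketch is an illustration under stated assumptions: the helper names `c_squared` and `l_squared_eigenvalue` are invented here, and the code only uses the fact that, by (4.72) and (4.73), $L_+L_-$ multiplies $|l,m\rangle$ by $C(m)^2$ and $L_-L_+$ multiplies it by $C(m+1)^2$.

```python
from fractions import Fraction

def c_squared(l, m):
    """Square of the normalization factor in (4.74): l(l+1) - m(m-1)."""
    return l * (l + 1) - m * (m - 1)

def l_squared_eigenvalue(l, m):
    """Eigenvalue of L^2 on |l, m>, computed term by term from (4.75)."""
    plus_minus = c_squared(l, m)       # L+ L- |l,m> = C(m)^2   |l,m>
    minus_plus = c_squared(l, m + 1)   # L- L+ |l,m> = C(m+1)^2 |l,m>
    return Fraction(1, 2) * (plus_minus + minus_plus) + m * m

for two_l in range(0, 5):              # l = 0, 1/2, 1, 3/2, 2
    l = Fraction(two_l, 2)
    for k in range(two_l + 1):         # m = l, l-1, ..., -l
        m = l - k
        assert l_squared_eigenvalue(l, m) == l * (l + 1)
```

The $m$-dependence cancels between the ladder terms and $L_z^2$, which is why the eigenvalue depends on $l$ only.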

Derivation of the normalization conditions

In this section, we will prove that the normalization coefficients have the form
given by equation (4.74). We will do it in the form of a proof by induction. To
simplify, we will assume them to be real, i.e. $C^* = C$. This is in fact a choice
that is always possible to make.

Proposition

The following recursive equations hold for $k \ge 0$

$$L_-|l,l-k\rangle = C(l-k)\,|l,l-k-1\rangle \tag{4.77}$$
$$L_+|l,l-k-1\rangle = C(l-k)\,|l,l-k\rangle \tag{4.78}$$
$$C(l-k)^2 - C(l-k+1)^2 = 2(l-k) \tag{4.79}$$

where the first equation is just a definition and need not be proved.
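Proof

For the base case put $k = 0$. Then the equations read

$$L_-|l,l\rangle = C(l)\,|l,l-1\rangle \tag{4.80}$$
$$L_+|l,l-1\rangle = C(l)\,|l,l\rangle \tag{4.81}$$
$$C(l)^2 = 2l \tag{4.82}$$

The last equation follows from the definition (4.71) of the highest weight
state

$$L_+|l,l\rangle = 0,$$

i.e. there is no coefficient corresponding to $l + 1$, or $C(l+1) = 0$.
The state $|l,l-1\rangle$ obtained from $L_-|l,l\rangle$ must be normalized, so that

$$\langle l,l|\,L_+L_-\,|l,l\rangle = \langle l,l-1|\,C(l)^*C(l)\,|l,l-1\rangle = C(l)^*C(l) = C(l)^2.$$

On the other hand, using the angular momentum algebra

$$\langle l,l|\,L_+L_-\,|l,l\rangle = \langle l,l|\,[L_+,L_-] + L_-L_+\,|l,l\rangle = 2\,\langle l,l|\,L_z\,|l,l\rangle = 2l,$$

so that $C(l)^2 = 2l$, which is (4.82). Furthermore

$$L_+|l,l-1\rangle = \frac{1}{C(l)}\,L_+L_-|l,l\rangle = \frac{1}{C(l)}\,[L_+,L_-]\,|l,l\rangle = \frac{2l}{C(l)}\,|l,l\rangle = C(l)\,|l,l\rangle.$$

This proves (4.81).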

For the induction step, assume that the equations (4.78) and (4.79) are true for
a certain $k$. First we consider (4.79) for $k + 1$. Calculating the norm of the state
$L_-|l,l-k-1\rangle$ using equation (4.77), which as pointed out is just a definition,
we get

$$\langle l,l-k-1|\,L_+L_-\,|l,l-k-1\rangle = C(l-k-1)^*\,C(l-k-1)\,\langle l,l-k-2|l,l-k-2\rangle = C(l-k-1)^2.$$

On the other hand, using the angular momentum algebra and the induction
hypothesis

$$\langle l,l-k-1|\,L_+L_-\,|l,l-k-1\rangle = \langle l,l-k-1|\,[L_+,L_-] + L_-L_+\,|l,l-k-1\rangle$$
$$= \langle l,l-k-1|\,2L_z + L_-L_+\,|l,l-k-1\rangle = 2(l-k-1) + C(l-k)^*\,C(l-k) = 2(l-k-1) + C(l-k)^2.$$

Combining the last two equations yields

$$C(l-k-1)^2 = 2(l-k-1) + C(l-k)^2,$$

which is precisely (4.79) with $k + 1$ instead of $k$.

Next we consider (4.78) for $k + 1$. Using the operator algebra and the
induction hypothesis yields

$$L_+|l,l-k-2\rangle = \frac{1}{C(l-k-1)}\,L_+L_-\,|l,l-k-1\rangle = \frac{1}{C(l-k-1)}\,\bigl([L_+,L_-] + L_-L_+\bigr)\,|l,l-k-1\rangle$$
$$= \frac{1}{C(l-k-1)}\,(2L_z + L_-L_+)\,|l,l-k-1\rangle = \frac{1}{C(l-k-1)}\,\bigl(2(l-k-1) + C(l-k)^2\bigr)\,|l,l-k-1\rangle$$
$$= \frac{1}{C(l-k-1)}\,C(l-k-1)^2\,|l,l-k-1\rangle = C(l-k-1)\,|l,l-k-1\rangle.$$

The recursive equations (4.79) can be solved. The result is

$$C(l-k) = \sqrt{(k+1)(2l-k)}. \tag{4.83}$$

There is an upper limit for $k$. Just as there is a state with highest value
for $m$, namely $|l,l\rangle$, there must be a state with lowest value for $m$. This occurs
when $k = 2l$ in the above equation, corresponding to

$$L_-|l,-l\rangle = 0,$$

which is (4.77) for $k = 2l$. So we learn that the spectrum for the operator
$L_z$ ranges in unit steps from a maximum weight of $l$ to a minimum weight of
$-l$. Furthermore, since $k$ is an integer, and $2l - k$ must be zero for some $k$, the
possible values of $l$ are $\{0, 1/2, 1, 3/2, 2, \ldots\}$.

Finally, putting $l - k = m$ we arrive at (4.74).
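The recursion (4.79) can also be run mechanically. The sketch below is a hedged illustration: the function name `ladder_coefficients_squared` is an invention for this example, and it works with the squares $C(m)^2$ so that exact fractions can be used. It starts from $C(l+1)^2 = 0$, steps down from the highest weight state, and stops when a coefficient vanishes.

```python
from fractions import Fraction

def ladder_coefficients_squared(l):
    """Iterate (4.79) downwards from the highest weight state |l, l>.

    Returns a dict mapping m = l - k to C(m)^2, ending at the first m
    for which C(m)^2 = 0 (the lowest weight state).
    """
    squares = {}
    previous = Fraction(0)          # C(l + 1)^2 = 0, from (4.71)
    k = 0
    while True:
        m = l - k
        current = previous + 2 * m  # (4.79) solved for C(l - k)^2
        if current < 0:
            # a negative norm square: the ladder never terminates cleanly
            raise ValueError("l must be a non-negative multiple of 1/2")
        squares[m] = current
        if current == 0:
            return squares
        previous = current
        k += 1

for two_l in range(0, 6):           # l = 0, 1/2, ..., 5/2
    l = Fraction(two_l, 2)
    squares = ladder_coefficients_squared(l)
    assert len(squares) == 2 * l + 1                 # 2l + 1 states
    for m, value in squares.items():
        k = l - m
        assert value == (k + 1) * (2 * l - k)        # (4.83)
        assert value == l * (l + 1) - m * (m - 1)    # (4.74)
```

For a value such as $l = 1/3$ the iteration produces a negative square before reaching zero, which is the numerical counterpart of the statement that $2l - k$ must vanish for some integer $k$.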



4.3.1 Spin 1/2 

In quantum physics, a distinction is made between angular momentum and
spin. Angular momentum refers, just as in classical physics, to rotational
motion in space. Spin, on the other hand, refers to an intrinsic property of
a particle. Though it can be thought of as some kind of rotation around an
internal axis of the particle, just as one would do for a spinning classical body,
there is actually no need for, and no experimental basis for, such a picture. Spin
is best regarded as an intrinsic quantity, along with other intrinsic quantities
such as mass, electric charge et cetera.

The simplest non-trivial, and for quantum computation most important,
example is the case of spin 1/2, or $l = \tfrac{1}{2}$. An example of a particle having spin
1/2 is the electron.

The abstract theory of angular momentum given in the preceding section
can be made concrete by representing the states $|l, m\rangle$ and operators $\mathbf{L}$ by vectors
and matrices. Thinking of the preceding abstract theory as an interface, we will
now provide an implementation. We will do it in the case $l = \tfrac{1}{2}$ as that will be
needed subsequently when discussing quantum computation.



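Implementation of spin one half

With $l = \tfrac{1}{2}$, the spectrum of $m$ is $\{\tfrac{1}{2}, -\tfrac{1}{2}\}$ with the highest weight state
$|\tfrac{1}{2}, \tfrac{1}{2}\rangle$ and the lowest weight state $|\tfrac{1}{2}, -\tfrac{1}{2}\rangle$. This notation is too heavy, and customarily
one uses a leaner syntax $|\tfrac{1}{2}\rangle$ and $|{-\tfrac{1}{2}}\rangle$. These two states are implemented by
2-dimensional unit vectors as

$$|\tfrac{1}{2}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |{-\tfrac{1}{2}}\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

The components of the angular momentum operators (spin operators) are
given by

$$L_x = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{4.86}$$
$$L_y = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{4.87}$$
$$L_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.88}$$

But we are more interested in $L_+$ and $L_-$ which are easily calculated from
$L_x$ and $L_y$

$$L_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \tag{4.89}$$
$$L_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{4.90}$$

Employing these concrete realizations, almost trivial matrix calculations yield

$$L_+|\tfrac{1}{2}\rangle = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = 0 \tag{4.91}$$

and in the same way

$$L_+|{-\tfrac{1}{2}}\rangle = |\tfrac{1}{2}\rangle, \qquad L_-|\tfrac{1}{2}\rangle = |{-\tfrac{1}{2}}\rangle, \qquad L_-|{-\tfrac{1}{2}}\rangle = 0,$$

thus implementing the raising and lowering operations of $L_+$ and $L_-$. Comparing
to the previous section, it can also be checked that the normalizations $C(\tfrac{1}{2})$
and $C(-\tfrac{1}{2})$ come out right.

Next we have to check the action of $L_z$

$$L_z|\tfrac{1}{2}\rangle = \tfrac{1}{2}\,|\tfrac{1}{2}\rangle, \qquad L_z|{-\tfrac{1}{2}}\rangle = -\tfrac{1}{2}\,|{-\tfrac{1}{2}}\rangle.$$

Thus we have a concrete realization of the angular momentum theory of the
preceding section in the particular case of $l = \tfrac{1}{2}$.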
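That the matrices really implement the interface can be checked by direct computation. The sketch below is an illustration only: the small helpers (`matmul`, `add`, `scale`, `commutator`, `apply`) are written out for this example so that nothing beyond the Python standard library is assumed, and the units are those with $\hbar = 1$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def apply(a, v):
    """Matrix a acting on the column vector v."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]

Lx = [[0, 0.5], [0.5, 0]]
Ly = [[0, -0.5j], [0.5j, 0]]
Lz = [[0.5, 0], [0, -0.5]]
up, down = [1, 0], [0, 1]           # the states |1/2> and |-1/2>

L_plus = add(Lx, scale(1j, Ly))     # (4.68)
L_minus = add(Lx, scale(-1j, Ly))   # (4.69)

# commutation relations (4.64) with hbar = 1, and (4.70)
assert commutator(Lx, Ly) == scale(1j, Lz)
assert commutator(Ly, Lz) == scale(1j, Lx)
assert commutator(Lz, Lx) == scale(1j, Ly)
assert commutator(L_plus, L_minus) == scale(2, Lz)

# raising and lowering of the two states
assert apply(L_plus, up) == [0, 0] and apply(L_plus, down) == up
assert apply(L_minus, up) == down and apply(L_minus, down) == [0, 0]

# L^2 = l(l+1) = 3/4 on both states, cf. (4.76)
L2 = add(add(matmul(Lx, Lx), matmul(Ly, Ly)), matmul(Lz, Lz))
assert L2 == [[0.75, 0], [0, 0.75]]
```

All entries are multiples of 1/4, so the floating-point comparisons here are exact; for general matrices a tolerance would be needed.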

Connection to quantum computation 

Spin 1/2 is an example of a two-state quantum system. In quantum computation
such systems are referred to as qubits. In that context, a slightly different
notation is used. Instead of $|\tfrac{1}{2}\rangle$ and $|{-\tfrac{1}{2}}\rangle$ the states are denoted by $|0\rangle$ and
$|1\rangle$ respectively, and of course, they are to be thought of as quantum versions
of the classical bit values 0 and 1.

Pauli matrices 

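In the context of describing spin 1/2 it is customary to express the angular
momentum operators in terms of the Pauli matrices. As is seen below, it is really
a trivial rescaling by a factor 1/2 that is involved, but theoretical physicists seem to be
fond of shuffling factors of 1/2 around.*

$$\sigma_x = 2L_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{4.97}$$
$$\sigma_y = 2L_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{4.98}$$
$$\sigma_z = 2L_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.99}$$

We now leave the theory of angular momentum at this stage.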



* As well as factors of 2, $\sqrt{2}$, $\pi$ et cetera!

Chapter 5

General quantum theory



In this chapter, quantum theory will be outlined in an abstract and formal way
suitable for discussing quantum computational models and quantum complexity
theory. The concepts of the preceding chapter will reappear but in a much more
formal setting. A modern general reference to quantum mechanics suitable for
studies into quantum computation is the book by A. Peres.

5.1 State spaces 

The state spaces of quantum mechanics are modeled on Hilbert spaces. These 
spaces can be either finite dimensional or infinite dimensional. The infinite di- 
mensional case requires a rather elaborate mathematical treatment if one wants 
to be stringent. For quantum computation it suffices to consider finite dimen- 
sional Hilbert spaces, as any real quantum computer would have a finite size in 
terms of number of states. 

In the first section, the theory of vector spaces will be reviewed. The intuition
behind the theory of vector spaces is the familiar real vectors of ordinary
three-dimensional space. Thus if $\mathbf{v}_1$ and $\mathbf{v}_2$ are two vectors in the space, so is
$a_1\mathbf{v}_1 + a_2\mathbf{v}_2$, where $a_1$ and $a_2$ are real numbers. These notions will be made precise
below.

A concise and readable physics-style reference to vector spaces is C. Isham's
book [37].

5.1.1 Vector spaces 

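Let $V$ denote an n-dimensional vector space over the complex numbers $\mathbb{C}$. This
means that for any vectors $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}$ in $V$ and any complex numbers $\alpha_1$, $\alpha_2$
and $\alpha$ the following equations expressing linearity hold

$$\alpha(\mathbf{x}_1 + \mathbf{x}_2) = \alpha\mathbf{x}_1 + \alpha\mathbf{x}_2 \tag{5.1}$$
$$(\alpha_1 + \alpha_2)\mathbf{x} = \alpha_1\mathbf{x} + \alpha_2\mathbf{x} \tag{5.2}$$
$$\alpha_1(\alpha_2\mathbf{x}) = (\alpha_1\alpha_2)\mathbf{x} \tag{5.3}$$
$$1\mathbf{x} = \mathbf{x} \tag{5.4}$$
$$0\mathbf{x} = \mathbf{0} \tag{5.5}$$

The space $V$ can be thought of as n-tuples of complex numbers arranged as
column vectors so that we can write $V = \mathbb{C}^n$.* An inner product (or a scalar
product) over $V$ is defined as a complex valued function $(\mathbf{x}, \mathbf{y})$ defined on $V \times V$,
subject to the conditions

$$(\mathbf{x}, \mathbf{x}) \ge 0 \text{ and } (\mathbf{x}, \mathbf{x}) = 0 \text{ if and only if } \mathbf{x} = \mathbf{0} \tag{5.6}$$
$$(\mathbf{x}, \alpha_1\mathbf{y}_1 + \alpha_2\mathbf{y}_2) = \alpha_1(\mathbf{x}, \mathbf{y}_1) + \alpha_2(\mathbf{x}, \mathbf{y}_2) \tag{5.7}$$
$$(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})^* \tag{5.8}$$

for all $\mathbf{x}, \mathbf{y}, \mathbf{y}_1, \mathbf{y}_2 \in V$ and all $\alpha_1, \alpha_2 \in \mathbb{C}$. The complex conjugate of a complex
number $\alpha$ is denoted by $\alpha^*$. Note that $(\mathbf{x}, \mathbf{x})$ is a real number by condition (5.8).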

Two vectors $\mathbf{x}$ and $\mathbf{y}$ are said to be orthogonal if $(\mathbf{x}, \mathbf{y}) = 0$.

A vector space supplied with an inner product is called an inner-product
vector space.

It can be shown that the inner product satisfies the Schwarz inequality

$$|(\mathbf{x}, \mathbf{y})| \le \sqrt{(\mathbf{x}, \mathbf{x})}\,\sqrt{(\mathbf{y}, \mathbf{y})}. \tag{5.9}$$

Norm

From the inner product, a norm can be defined by $\|\mathbf{x}\| = \sqrt{(\mathbf{x}, \mathbf{x})}$. From this
definition and the properties of the inner product, it follows that

$$\|\alpha\mathbf{x}\| = |\alpha|\,\|\mathbf{x}\| \text{ for all complex numbers } \alpha \tag{5.10}$$
$$\|\mathbf{x}\| \ge 0 \text{ with equality only if } \mathbf{x} = \mathbf{0} \tag{5.11}$$
$$\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|, \tag{5.12}$$

ensuring that $\|\mathbf{x}\|$ satisfies all requirements for a proper norm on a vector space.
The last property, (5.12), is the triangle inequality, which can be shown from the
Schwarz inequality.

A vector x is said to be normalized if ||x|| = 1. 

Quantum states and rays 

Quantum states are represented by normalized vectors $\Psi$. Actually, if the
normalized states $\Psi$ and $\Psi'$ are related by the equation $\Psi' = \alpha\Psi$ where $|\alpha| = 1$,
then $\|\Psi'\| = \|\Psi\| = 1$, and they are thought of as representing the same state.
This is referred to as saying that the two vectors belong to the same ray in the
space. Thus it is more correct to say that quantum states are represented by
equivalence classes of normalized vectors.

* A more correct way to express this is: there is an isomorphism $i : V \to \mathbb{C}^n$.






Linear independence and basis sets of vectors

A finite set of vectors $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$ is linearly dependent if there exists some
set of complex numbers $\{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ (not all of them zero) such that

$$\sum_{i=1}^{k} \alpha_i\mathbf{x}_i = \mathbf{0}.$$

If there is no such set of complex numbers, then the set of vectors is said
to be linearly independent. This gives a precise way of defining the dimension
of a finite dimensional vector space. A vector space is n-dimensional if it
contains a subset of $n$ linearly independent vectors but no subset of $n + 1$ linearly
independent vectors.

A subset $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ of an n-dimensional vector space $V$ is a basis for $V$
if any vector $\mathbf{x}$ in $V$ can be expanded as

$$\mathbf{x} = \sum_{i=1}^{n} \alpha_i\mathbf{e}_i. \tag{5.13}$$

The numbers $\alpha_i$ are called expansion coefficients. They are unique. This
follows from the linear independence of the basis vectors.



Orthonormal basis vectors 

A pair of vectors $\mathbf{x}$ and $\mathbf{y}$ is said to be an orthonormal pair if they are both normalized
and orthogonal to each other. An orthonormal basis for an n-dimensional vector
space is a set of basis vectors $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$, all of which are normalized and in
which every pair is orthogonal. Or more concisely

$$(\mathbf{e}_i, \mathbf{e}_j) = \delta_{ij} \text{ for } i, j = 1, 2, \ldots, n, \tag{5.14}$$

where $\delta_{ij}$ is the Kronecker delta which is equal to 1 for $i = j$ and 0 otherwise.

It can be shown that every finite-dimensional inner-product vector space has an
orthonormal basis. A constructive proof of this is given by the inductive Gram-Schmidt
procedure. Suppose $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for the vector space under
consideration. For the base case, define

$$\mathbf{e}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}. \tag{5.15}$$

Then define inductively for $1 \le k \le n - 1$

$$\mathbf{e}_{k+1} = \frac{\mathbf{v}_{k+1} - \sum_{i=1}^{k} (\mathbf{e}_i, \mathbf{v}_{k+1})\,\mathbf{e}_i}{\bigl\|\mathbf{v}_{k+1} - \sum_{i=1}^{k} (\mathbf{e}_i, \mathbf{v}_{k+1})\,\mathbf{e}_i\bigr\|}. \tag{5.16}$$

It is straightforward to verify by direct calculation that this yields an
orthonormal basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$.

Orthonormal basis vectors are useful as it is possible to compute the
expansion coefficients explicitly. Suppose we have a vector $\mathbf{x}$ with (unknown)
expansion

$$\mathbf{x} = \sum_{i=1}^{n} \alpha_i\mathbf{e}_i.$$

Taking the inner product on both sides of the equation with an arbitrary basis
vector $\mathbf{e}_j$ yields the following short calculation

$$(\mathbf{e}_j, \mathbf{x}) = \Bigl(\mathbf{e}_j, \sum_{i=1}^{n} \alpha_i\mathbf{e}_i\Bigr) = \sum_{i=1}^{n} \alpha_i\,(\mathbf{e}_j, \mathbf{e}_i) = \sum_{i=1}^{n} \alpha_i\,\delta_{ij} = \alpha_j.$$

Thus we get the simple formula for the expansion coefficients

$$\alpha_j = (\mathbf{e}_j, \mathbf{x}). \tag{5.17}$$
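The Gram-Schmidt procedure and the coefficient formula (5.17) translate almost line by line into code. The following is a minimal sketch under stated assumptions: vectors are plain Python lists of complex numbers, the function names are chosen for this example, and the three input vectors are arbitrary (but linearly independent) test data.

```python
import math

def inner(x, y):
    """Inner product (x, y): conjugate-linear in x, linear in y, cf. (5.7)-(5.8)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors following (5.15)-(5.16)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, v)                         # the coefficient (e_i, v_{k+1})
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = norm(w)
        basis.append([wi / length for wi in w])
    return basis

def expansion_coefficients(basis, x):
    """alpha_j = (e_j, x), valid for an orthonormal basis, cf. (5.17)."""
    return [inner(e, x) for e in basis]

# Hypothetical input: three linearly independent vectors in C^3.
vs = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]
es = gram_schmidt(vs)

x = [2, -1j, 3]
alphas = expansion_coefficients(es, x)
# rebuild x from its expansion, component by component, as in (5.13)
rebuilt = [sum(a * e[r] for a, e in zip(alphas, es)) for r in range(3)]
```

With floating-point arithmetic the orthonormality and the reconstruction of `x` hold only up to rounding, so any check should use a small tolerance.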

Dual vector space 

To each vector $\mathbf{x}$ in $V$ there is a dual vector $\mathbf{x}^\dagger$. If $\mathbf{x}$ is represented as a column
vector, then $\mathbf{x}^\dagger$ is represented by a row vector whose elements are the
complex conjugates of the elements of $\mathbf{x}$. In this representation, the inner product
between two vectors $\mathbf{x}$ and $\mathbf{y}$ can be defined as

$$(\mathbf{x}, \mathbf{y}) = x_1^* y_1 + x_2^* y_2 + \ldots + x_n^* y_n. \tag{5.18}$$

This form of the inner product satisfies all the conditions (5.6), (5.7) and
(5.8).

Another, more abstract point of view, is to consider $\mathbf{x}^\dagger$ as a linear functional
from $V$ to $\mathbb{C}$ defined by $\mathbf{x}^\dagger(\mathbf{y}) = (\mathbf{x}, \mathbf{y})$. However, in practical calculations the
explicit vector representation is useful.

5.1.2 Hilbert spaces 

In order for a general vector space $V$ to be a Hilbert space, two requirements
must be met: (i) there must exist a norm defined in terms of an inner product,
and (ii) the set of vectors must be complete with respect to the norm.
Completeness means that all Cauchy sequences of vectors converge to vectors in the
space. This last point is somewhat elaborated in the next section.

Completeness

A Hilbert space $\mathcal{H}$ is complete in the sense that if $\{\mathbf{x}_n\}$ is a sequence in $\mathcal{H}$
with $\lim_{n,m\to\infty} \|\mathbf{x}_n - \mathbf{x}_m\| = 0$ then there is an $\mathbf{x}$ in $\mathcal{H}$ to which the sequence
$\{\mathbf{x}_n\}$ converges, i.e. $\lim_{n\to\infty} \|\mathbf{x}_n - \mathbf{x}\| = 0$. Thus, in a Hilbert space, every Cauchy
sequence is convergent. (The converse, that every convergent sequence is a
Cauchy sequence, is true in every normed vector space.)

For infinite-dimensional vector spaces, this condition is not automatically
true. In certain cases an intricate process of completing the space with all
limits of Cauchy sequences can be performed (in much the same way as the
set of rational numbers is completed to form the set of real numbers). The
requirement of completeness is needed in order that the usual tools of analysis,
such as taking limits and differentiating, can be used.

Finite dimensional inner-product vector spaces over the complex numbers
are Hilbert spaces. This follows from the fact that the complex numbers are
complete with respect to their usual absolute value norm, and this is enough
to ensure completeness of finite-dimensional vector spaces over the complex
numbers. No process of completion is needed in this case.

In the context of quantum computation, this is actually more than we want, 
since completeness in this sense means that there are vectors in the Hilbert space 
whose components are non-computable numbers. Non-computable functions 
could therefore be hidden within the specification of the quantum computer. 
We will return to this point later on. 

5.1.3 Dirac notation 

After this general introduction to vector spaces, we will now change the notation 
somewhat in order to be in conformity with mainstream quantum theory. 

Dirac invented a notational system that is very useful both for formal
treatment of quantum mechanics and explicit calculations. A vector in the vector
space is denoted by a ket $|\ \rangle$. Inside the ket, one places any symbol or symbols
that characterize the state, so for example the vector $\mathbf{x}$ can be denoted by $|x\rangle$.
A generic quantum state is often denoted by $|\psi\rangle$.

This system is very versatile. If we are working with a specific orthonormal
basis for a certain vector space, the basis vectors can be denoted concisely by
$|i\rangle$, i.e. we just record the index within the ket symbol. The expansion of a
vector $|x\rangle$ in terms of the basis $|i\rangle$ is written as

$$|x\rangle = \sum_{i=1}^{n} \alpha_i\,|i\rangle, \tag{5.19}$$

in analogy to equation (5.13).

In this notational system, the dual to $|x\rangle$ is denoted by $\langle x|$ and we have
$|x\rangle^\dagger = \langle x|$. A vector $\langle\psi|$ in the dual space is called a bra. The inner product
is written $(\mathbf{x}, \mathbf{y}) = \langle x|y\rangle$. Thus, the orthonormality requirement for a basis
becomes

$$\langle i|j\rangle = \delta_{ij}, \tag{5.20}$$

and the formula (5.17) for the expansion coefficients becomes

$$\alpha_i = \langle i|x\rangle. \tag{5.21}$$



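If the ket vector $|x\rangle$ is represented concretely by a column vector, the
corresponding dual bra vector is represented by a row vector, the elements of which
are complex conjugates of the elements of the ket vector, or

$$|x\rangle = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} \;\Rightarrow\; \langle x| = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_n^*). \tag{5.22}$$

In this notation we have to decide how to treat the equation $(\mathbf{x}, \mathbf{y}) = (\mathbf{y}, \mathbf{x})^*$
of the definition of the inner product. Referring to equation (5.18) we see that
taking the complex conjugate we get

$$(\mathbf{x}, \mathbf{y})^* = (x_1^* y_1 + x_2^* y_2 + \ldots + x_n^* y_n)^* = y_1^* x_1 + y_2^* x_2 + \ldots + y_n^* x_n = (\mathbf{y}, \mathbf{x}),$$

making the identification natural

$$\langle x|y\rangle^* = \langle y|x\rangle \tag{5.23}$$

where

$$\langle y| = |y\rangle^\dagger \text{ and } |x\rangle = \langle x|^\dagger. \tag{5.24}$$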

From now on, the Dirac notation will be used almost exclusively. The terms 
state, vector, bra and ket will be used interchangeably in the following. No 
confusion can arise. 

The Dirac system of notation captures a deep property of quantum
mechanics, namely that a physical system can be represented in many different ways.
The explicit representation that we use carries very little significance and can 
be chosen for convenience. 



5.1.4 Tensor products 

Using tensor products, new (larger) vector spaces can be built out of existing
vector spaces. This is a common practice in quantum mechanics where for
example the joint states $|\psi_{12}\rangle$ of two independent systems can be built from the
states of the constituents $|\psi_1\rangle$ and $|\psi_2\rangle$.

To make this notion precise, let $V$ and $W$ be vector spaces of dimension $n$
and $m$, and let $|v\rangle$ and $|w\rangle$ denote generic vectors in these spaces respectively.
The tensor product $V \otimes W$ is defined as the $nm$-dimensional vector space
consisting of all linear combinations of 'tensor products' of vectors $|v\rangle \otimes |w\rangle$. This
is an abstract product, which by definition satisfies

$$\alpha\,(|v\rangle \otimes |w\rangle) = (\alpha|v\rangle) \otimes |w\rangle = |v\rangle \otimes (\alpha|w\rangle) \tag{5.25}$$
$$(|v_1\rangle + |v_2\rangle) \otimes |w\rangle = |v_1\rangle \otimes |w\rangle + |v_2\rangle \otimes |w\rangle \tag{5.26}$$
$$|v\rangle \otimes (|w_1\rangle + |w_2\rangle) = |v\rangle \otimes |w_1\rangle + |v\rangle \otimes |w_2\rangle. \tag{5.27}$$

These conditions on $|v\rangle \otimes |w\rangle$ suffice to prove that $V \otimes W$ is indeed a linear
vector space, i.e. that the conditions (5.1) to (5.5) are satisfied.

For an example of a concrete realization of the tensor product, consider two
vectors $|x\rangle$ and $|y\rangle$

$$|x\rangle = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} \text{ and } |y\rangle = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$$

in a 2-dimensional vector space. The tensor product of the vectors becomes

$$|x\rangle \otimes |y\rangle = \begin{pmatrix} \alpha_1\beta_1 \\ \alpha_1\beta_2 \\ \alpha_2\beta_1 \\ \alpha_2\beta_2 \end{pmatrix}.$$
5.2 Operators and dynamical variables 

The intuition behind (abstract) quantum mechanics is that if vectors in a Hilbert 
space are used to describe states of a physical system, then it is natural to use 
linear operators acting on the space to describe the dynamics, i.e. changes of
state. A linear operator acting on a state yields a new state. In the following 
section we will make the basis for this intuition exact. The dynamical variables 
of classical physics, like position, momentum and energy, will be mapped to 
linear operators in quantum mechanics. 

5.2.1 Linear operators

A linear operator $A$ on a vector space $V$ is a linear map from $V$ to itself, that
is $A : V \to V$. The image $A(|x\rangle)$ of a vector $|x\rangle$ is written $A|x\rangle$. The linearity
requirement is expressed as

$$A\,(\alpha_1|x_1\rangle + \alpha_2|x_2\rangle) = \alpha_1 A|x_1\rangle + \alpha_2 A|x_2\rangle \tag{5.28}$$

for complex numbers $\alpha_1, \alpha_2$ and vectors $|x_1\rangle, |x_2\rangle$.

Two special linear operators are the identity operator $I_V$ and the zero operator
0, with properties

$$I_V|x\rangle = |x\rangle \tag{5.29}$$
$$0|x\rangle = 0. \tag{5.30}$$

The subscript $V$ on the identity operator is often dropped when no confusion
can arise. The symbol 0 is somewhat overloaded, denoting the ordinary complex
number 0, the zero vector, and the zero operator. This causes no confusion in
practice though. Note however, that $|0\rangle$ does not denote the zero vector!

An operator $A$ is invertible if there exists an inverse operator $A^{-1}$ with the
property

$$A^{-1}A = AA^{-1} = I. \tag{5.31}$$

Matrix representation 

Linear operators on a vector space can be represented by matrices.
Let $\{|1\rangle, |2\rangle, \ldots, |n\rangle\}$ be a basis (not necessarily orthonormal) for the vector
space. Any vector $|x\rangle$ can be expanded as in (5.19),

$$|x\rangle = \sum_{i=1}^{n} \alpha_i\,|i\rangle.$$

Applying the linear operator $A$ on both sides of the equation and using
linearity yields

$$A|x\rangle = \sum_{i=1}^{n} \alpha_i\,A|i\rangle.$$

For each $i$, $A|i\rangle$ is a vector in the vector space, and since $\{|1\rangle, |2\rangle, \ldots, |n\rangle\}$
is a basis, there must exist complex numbers $A_{ji}$ for $i, j = 1, 2, \ldots, n$ such that
$A|i\rangle$ can be expanded

$$A|i\rangle = \sum_{j=1}^{n} A_{ji}\,|j\rangle. \tag{5.32}$$

Inserting this into the previous equation gives

$$A|x\rangle = \sum_{i=1}^{n} \alpha_i \sum_{j=1}^{n} A_{ji}\,|j\rangle = \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} A_{ji}\,\alpha_i\Bigr)|j\rangle.$$

Representing vectors concretely as column vectors of expansion coefficients
as in equation (5.22) we see that the action of the linear operator $A$ on
components of a vector $|x\rangle$ can be written as a transformation equation

$$\alpha_j' = \sum_{i=1}^{n} A_{ji}\,\alpha_i, \tag{5.33}$$

where $\alpha_j'$ denotes the components of the vector $A|x\rangle$.

Already at this stage it is clear that the theory of transformations on
vector spaces offers the potential for setting up a model of computation. Letting
the state of the computer be represented by the vector $|x\rangle$, linear operators
induce transitions between states of the computer. The details must of course
be elaborated, which is the subject of subsequent chapters.

If the vector space has an inner product and the basis is orthonormal there
is a convenient way to calculate the matrix elements $A_{ji}$ of the linear operator.
Taking the inner product with a dual basis vector $\langle k|$ on both sides of equation
(5.32) and using orthonormality of the basis vectors gives

$$\langle k|A|i\rangle = \langle k|\sum_{j=1}^{n} A_{ji}\,|j\rangle = \sum_{j=1}^{n} A_{ji}\,\langle k|j\rangle = \sum_{j=1}^{n} A_{ji}\,\delta_{kj} = A_{ki},$$

that is

$$A_{ki} = \langle k|A|i\rangle. \tag{5.34}$$

This is a representation for the matrix elements that is very often used in
quantum mechanics.

Note that the matrix representation of a certain linear operator on a vector
space depends on the basis used. Different bases give different matrix
representations, and consequently the matrix elements given by (5.34) are different.

Also note that, as we are only considering operators mapping V to V, the 
matrices representing the operators are n x n matrices. 
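Formula (5.34) and the basis dependence of the matrix representation can both be seen in a small computation. The sketch below is an illustration under stated assumptions: the operator `swap` is a made-up example of a linear operator on $\mathbb{C}^2$, given as a function rather than as a matrix, and the two orthonormal bases are chosen for this example only.

```python
def inner(x, y):
    """Inner product <x|y> for vectors given as lists of components."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def matrix_elements(operator, basis):
    """A[k][i] = <k|A|i> for an orthonormal basis, as in (5.34)."""
    images = [operator(ket) for ket in basis]
    return [[inner(bra, image) for image in images] for bra in basis]

def swap(v):
    """A hypothetical linear operator on C^2: exchanges the two components."""
    return [v[1], v[0]]

standard = [[1, 0], [0, 1]]
s = 2 ** -0.5
rotated = [[s, s], [s, -s]]        # another orthonormal basis of C^2

A_standard = matrix_elements(swap, standard)
A_rotated = matrix_elements(swap, rotated)
```

In the standard basis the matrix has ones off the diagonal, while in the rotated basis the same operator is represented (up to rounding) by a diagonal matrix with entries 1 and -1.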



5.2.2 Outer products 

Let $|x\rangle$ and $|y\rangle$ be two vectors in a vector space. By the outer product between
these vectors we mean $|x\rangle\langle y|$. This can be considered to define a linear operator
on the vector space as is seen from the following formal calculation

$$(|x\rangle\langle y|)\,|z\rangle = |x\rangle\,(\langle y|z\rangle) = \langle y|z\rangle\,|x\rangle. \tag{5.35}$$

Thus the vector $|z\rangle$ is mapped to the vector $\mu|x\rangle$ where $\mu$ is the complex
number $\langle y|z\rangle$.

The usefulness of this concept becomes clear when it is applied to orthonormal
basis vectors. Let $\{|i\rangle\}$ be an orthonormal basis for a vector space $V$. We
have the expansion (5.19) of an arbitrary vector $|x\rangle$

$$|x\rangle = \sum_{i=1}^{n} \alpha_i|i\rangle$$

and the equation (5.21) for the expansion coefficients

$$\alpha_i = \langle i|x\rangle.$$

Then consider the operator

$$\sum_{i=1}^{n} |i\rangle\langle i|. \tag{5.36}$$

Letting it act on the vector $|x\rangle$ yields

$$\Big(\sum_{i=1}^{n}|i\rangle\langle i|\Big)|x\rangle = \sum_{i=1}^{n}|i\rangle\langle i|x\rangle = \sum_{i=1}^{n}\langle i|x\rangle|i\rangle = \sum_{i=1}^{n}\alpha_i|i\rangle = |x\rangle.$$

But this equation is true for any vector $|x\rangle$, so we can identify the operator
in equation (5.36) with the identity operator, or

$$\sum_{i=1}^{n}|i\rangle\langle i| = I. \tag{5.37}$$

This is known as the completeness relation.

The reader might worry about ambiguities in the notation when writing
expressions such as $|x\rangle\langle y|z\rangle$. It is not clear whether this should be read as the
operator $|x\rangle\langle y|$ acting on the state $|z\rangle$ or the number $\langle y|z\rangle$ multiplying the state
$|x\rangle$. However, there is no ambiguity and the expression can be read either
way. It denotes a certain state which can be calculated either as $(|x\rangle\langle y|)|z\rangle$
or $(\langle y|z\rangle)|x\rangle$. This is one aspect of the strength and versatility of the Dirac
notation.
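
Both readings of the outer product, and the completeness relation, can be checked numerically. The following is a minimal sketch in plain Python; the vectors and the particular orthonormal basis of $\mathbb{C}^2$ are arbitrary choices made here for illustration.

```python
# Outer products and the completeness relation, checked on small examples.

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def outer(ket, bra_of):
    """Matrix of the operator |ket><bra_of| (list of rows)."""
    return [[k * b.conjugate() for b in bra_of] for k in ket]

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# (|x><y|)|z> read as an operator acting on |z> ...
x, y, z = [1, 2j], [1j, 1], [3, 1 - 1j]
as_operator = apply(outer(x, y), z)
# ... and read as the number <y|z> multiplying |x>.
as_number = [inner(y, z) * c for c in x]

# Completeness: summing |i><i| over an orthonormal basis gives the identity.
s = 2 ** -0.5
basis = [[s, s * 1j], [s, -s * 1j]]   # assumed example basis of C^2
total = [[0, 0], [0, 0]]
for ket in basis:
    total = add(total, outer(ket, ket))
```

The two lists `as_operator` and `as_number` agree, and `total` is the $2 \times 2$ identity matrix up to rounding.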



5.2.3 Projectors 

An important class of operators are the projectors. These are operators that
project a state onto a subspace of the Hilbert space. Suppose we have an
$n$-dimensional Hilbert space with an orthonormal basis $\{|i\rangle\}$ and let $(k)$
denote a subset consisting of $k$ of the basis vectors. Then consider the operators

$$P_{(k)} = \sum_{i\in(k)} |i\rangle\langle i|. \tag{5.38}$$

Taking $(k)$ to be the whole basis, this is just the identity operator $I$.

Next consider an arbitrary state $|x\rangle = \sum_{j=1}^{n}\alpha_j|j\rangle$ and let $P_{(k)}$ act on this
state

$$P_{(k)}|x\rangle = \sum_{i\in(k)}|i\rangle\langle i|\Big(\sum_{j=1}^{n}\alpha_j|j\rangle\Big) = \sum_{i\in(k)}\sum_{j=1}^{n}\alpha_j|i\rangle\langle i|j\rangle = \sum_{i\in(k)}\alpha_i|i\rangle,$$

effectively restricting the summation to the subset $(k)$. Thus $P_{(k)}$ projects the
state onto the subspace spanned by the subset of basis vectors $(k)$.

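An important property of projection operators is that acting twice with the
same projector has no further effect on the state. This is almost trivial, as
can be seen by acting one more time with $P_{(k)}$ on the projected state

$$P_{(k)}\big(P_{(k)}|x\rangle\big) = \sum_{i\in(k)}|i\rangle\langle i|\Big(\sum_{j\in(k)}\alpha_j|j\rangle\Big) = \sum_{i\in(k)}\alpha_i|i\rangle = P_{(k)}|x\rangle.$$

Thus we find the general operator equation for projectors

$$P_{(k)}P_{(k)} = P_{(k)}. \tag{5.39}$$

An important subclass of projectors are the projectors $P_i$ onto the basis
states themselves. These are given simply by

$$P_i = |i\rangle\langle i|, \tag{5.40}$$

satisfying the equation

$$P_iP_j = P_i\delta_{ij}. \tag{5.41}$$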
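
The projector properties can be verified on a small case. The sketch below works in the standard basis of $\mathbb{C}^3$ with 0-based indices; the helper name `projector` and the example state are assumptions made for illustration.

```python
# Projectors onto subsets of the standard basis of C^n (indices are 0-based here).

def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def projector(indices, n):
    """Sum of |i><i| over the chosen basis indices."""
    return [[1 if (r == c and r in indices) else 0 for c in range(n)]
            for r in range(n)]

n = 3
P = projector({0, 2}, n)        # projects onto the span of |0> and |2>
x = [1 + 1j, 2, 3j]
Px = apply(P, x)                # the component along |1> is removed

P0, P1 = projector({0}, n), projector({1}, n)
zero = [[0] * n for _ in range(n)]
```

Here `matmul(P, P)` equals `P`, `matmul(P0, P1)` is the zero matrix and `matmul(P0, P0)` equals `P0`, in line with equations (5.39) and (5.41).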



5.2.4 Adjoints 

The adjoint $A^\dagger$ of a complex matrix $A$ is the matrix obtained by transposing
and complex conjugating the matrix elements

$$(A^\dagger)_{ij} = A^*_{ji}. \tag{5.42}$$

In order to define the abstract notion of adjoint operators, the action of an
operator on a bra vector has to be defined. This is a point where some confusion
might arise as to how to employ the notational system.

As argued in [Dirac], the inner product of a bra vector $\langle y|$ with a ket vector
$A|x\rangle$ is a complex number that depends linearly on $|x\rangle$; therefore it can likewise
be considered as the inner product of $|x\rangle$ with some, as yet undefined, bra
vector. This bra vector depends linearly on $\langle y|$, so it can be considered as the
result of applying a linear operator to $\langle y|$. Since this linear operator is uniquely
determined by the original linear operator $A$, it can be considered to be the same
operator.

Then, choosing the convention of writing the action of the linear operator
$A$ on $\langle y|$ as $\langle y|A$, i.e. with the operator to the right of the bra, we get the two
ways of writing the inner product discussed above

$$\langle y|(A|x\rangle) \quad \text{or} \quad (\langle y|A)|x\rangle.$$

But from the linearity, this 'product' is clearly associative, and we can write it
simply as

$$\langle y|A|x\rangle$$

where the operator can be considered to act either to the right or to the left. The
correctness of this is also borne out by writing out the product concretely as
matrices and row and column vectors.

The adjoint of the operator $A$ is defined as that operator $A^\dagger$ which, acting
on an arbitrary bra vector $\langle x|$, yields the same vector as the dual to the vector
$A|x\rangle$, or in formulas

$$\langle x|A^\dagger = (A|x\rangle)^\dagger. \tag{5.43}$$

Again, this definition can be justified using linearity.
From (5.43) follows the important property

$$\langle y|A^\dagger|x\rangle = \langle x|A|y\rangle^*. \tag{5.44}$$

This equation could in fact be used as an alternative definition of the adjoint.
Let us derive it since the short calculation illustrates the workings of
the formalism. Start with the right hand side, taking the complex conjugate of
$\langle x|A|y\rangle$

$$\langle x|A|y\rangle^* = \big(\langle x|(A|y\rangle)\big)^\dagger = (A|y\rangle)^\dagger(\langle x|)^\dagger = \langle y|A^\dagger|x\rangle$$

where in the first step parentheses are introduced to emphasize which parts of
the expression are grouped together, next equations (5.23) and (5.24) are used,
and finally the definition of the adjoint (5.43) is employed.

Note that in terms of matrices, the adjoint is the same as the conjugate
transpose, or $A^\dagger = (A^*)^T$.
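
Property (5.44) is easy to test numerically once the adjoint is realized as the conjugate transpose. The matrix and vectors below are arbitrary illustrative choices, and the helper names are assumptions of this sketch.

```python
# Numerical check of <y|A^dagger|x> = <x|A|y>^* for an arbitrary 2x2 matrix.

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def dagger(A):
    """Conjugate transpose: (A^dagger)_ij = conj(A_ji)."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[1 + 2j, 3], [4j, 5 - 1j]]   # not hermitean, chosen arbitrarily
x = [1, 1j]
y = [2 - 1j, 3]

lhs = inner(y, apply(dagger(A), x))        # <y|A^dagger|x>
rhs = inner(x, apply(A, y)).conjugate()    # <x|A|y>^*
```

The two numbers `lhs` and `rhs` coincide. Taking the adjoint twice also returns the original matrix, `dagger(dagger(A)) == A`.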

Hermitean and unitary operators

Of special interest in quantum mechanics are hermitean and unitary operators.
They play the roles of representing observable quantities and effecting
transformations, respectively.

An operator $A$ is said to be hermitean or self-adjoint if

$$A^\dagger = A. \tag{5.45}$$

An operator $U$ is said to be unitary if

$$U^\dagger = U^{-1}. \tag{5.46}$$

Hermitean operators correspond to observable physical quantities. Unitary
operators correspond to transformations of states.

5.2.5 Composition of operators

Since the result of a linear operator acting on a state is again a state,
composition of operators is naturally defined as

$$(AB)|x\rangle = A(B|x\rangle) = AB|x\rangle. \tag{5.47}$$

Composition is associative

$$A(BC) = (AB)C = ABC \tag{5.48}$$

so there is no need to use parentheses.

The 'product' of two operators $A$ and $B$ can be concretely realized in terms
of ordinary matrix multiplication given matrix representations of the operators
(in the same basis). As in matrix multiplication, the product is in general
not commutative. The non-commutativity of quantum mechanical operators
is an important property of quantum mechanics and leads to the celebrated
uncertainty relations connecting results of measurements of non-commuting
observables. But we will come to this in due time.

Commutators 

Certain combinations of operators often occur in quantum mechanics. One is
the commutator between two operators $A$ and $B$. It is defined as

$$[A,B] = AB - BA. \tag{5.49}$$

Since, in general, operators don't commute, the commutator is in general 
non-zero. Sets of hermitean operators that do commute among themselves are 
especially important in that they can represent sets of physical quantities that 
can be measured simultaneously. 
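
As a concrete illustration of the commutator, the sketch below uses the Pauli matrices, which are not introduced in this section and are brought in here only as a familiar example of non-commuting hermitean operators. It also checks that two diagonal matrices commute.

```python
# Commutators of small matrices: [A, B] = AB - BA.

def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """Return AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[ab - ba for ab, ba in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

# Pauli matrices (assumed example operators).
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

XY = commutator(X, Y)            # equals 2i * Z, so X and Y do not commute

# Two operators that are diagonal in the same basis always commute.
D1 = [[1, 0], [0, 2]]
D2 = [[3, 0], [0, 4]]
DD = commutator(D1, D2)          # the zero matrix
```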

5.3 Transformations and symmetries 

Unitary operators effect symmetry transformations of quantum states and quan- 
tum operators. Symmetries are transformations of states that do not affect 
observable quantities, i.e they do not change the results of measurements. 

A first look at measurement 

A measurement always results in a number. It is a fundamental property of
quantum mechanics that the states themselves are not observable or measurable.
Essentially the only way to get numbers out of quantum mechanics is by taking
inner products of states. Since the states are described by vectors it is reasonable
to suspect that physical quantities are described by linear operators. Thus
the result of a measurement is in some way related to inner products of the
form $\langle\psi|A|\psi\rangle$. Such inner products are often called diagonal matrix elements
in analogy to (5.34). Furthermore, if $A$ is a hermitean operator, $\langle\psi|A|\psi\rangle$ is
a real number which is interpreted as the expectation value for the quantity
represented by $A$. The theory of measurement will be developed in section 5.6
after some more terminology is introduced.

Symmetry transformations 

In order to study symmetry transformations, suppose $|\psi\rangle$ and $|\phi\rangle$ are two
quantum states. Acting on these states with the unitary operator $U$ we get the
transformed states $U|\psi\rangle$ and $U|\phi\rangle$. It is customary to write transformations as

$$|\psi\rangle \to |\psi'\rangle = U|\psi\rangle. \tag{5.50}$$

Likewise, the transformation of a bra vector is

$$\langle\psi| \to \langle\psi'| = \langle\psi|U^\dagger. \tag{5.51}$$

That this is the correct form of a transformation of a bra vector follows from
the equation $(U|\phi\rangle)^\dagger = \langle\phi|U^\dagger$ applied to the transformation of a ket vector.

With $U$ a unitary operator, the inner product between the states $|\phi\rangle$ and
$|\psi\rangle$ is unaffected by this transformation. We get

$$\langle\phi|\psi\rangle \to \langle\phi|U^\dagger U|\psi\rangle = \langle\phi|U^{-1}U|\psi\rangle = \langle\phi|\psi\rangle.$$

The corresponding form for a transformation of a linear operator can be derived
by demanding the matrix element $\langle\phi|A|\psi\rangle$ to be invariant under a
transformation. We know how to transform states of the form $|\psi\rangle$. Consider
transforming states of the form $A|\psi\rangle$, i.e. states acted on by a linear operator $A$. The
transformed state is $UA|\psi\rangle$, which can be expanded as

$$UA|\psi\rangle = UA(U^{-1}U)|\psi\rangle = (UAU^{-1})U|\psi\rangle.$$

In this way we separate the transformation of the state $A|\psi\rangle$ into a
transformation of the state $|\psi\rangle$ and the operator $A$. Thus it is natural to define a
transformation of a linear operator $A$ as

$$A \to A' = UAU^{-1} = UAU^\dagger. \tag{5.52}$$

That this is a reasonable definition is borne out by calculating the
transformation of the matrix element $\langle\phi|A|\psi\rangle$

$$\langle\phi|A|\psi\rangle \to (\langle\phi|U^\dagger)\,UAU^\dagger\,(U|\psi\rangle) = \langle\phi|(U^\dagger U)A(U^\dagger U)|\psi\rangle = \langle\phi|A|\psi\rangle,$$

which shows the invariance of the matrix element under the transformation. 

So although states and operators are affected by symmetry transformations, 
measurable quantities are not; this is the essence of symmetry in quantum me- 
chanics. 
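
The invariance of inner products and matrix elements under a unitary transformation can be checked numerically. In the sketch below the unitary matrix, the operator and the two states are arbitrary choices made for illustration; only the transformation rules themselves are taken from the text.

```python
# Invariance of <phi|psi> and <phi|A|psi> under |psi> -> U|psi>, A -> U A U^dagger.

def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

s = 2 ** -0.5
U = [[s, s * 1j], [s * 1j, s]]        # an example unitary matrix
A = [[1, 2 - 1j], [2 + 1j, -1]]       # an example hermitean operator
phi, psi = [1, 1j], [2, 1 - 1j]

phi_t, psi_t = apply(U, phi), apply(U, psi)     # transformed states
A_t = matmul(matmul(U, A), dagger(U))           # transformed operator

overlap_before, overlap_after = inner(phi, psi), inner(phi_t, psi_t)
element_before = inner(phi, apply(A, psi))
element_after = inner(phi_t, apply(A_t, psi_t))
```

Up to rounding, `overlap_before` equals `overlap_after` and `element_before` equals `element_after`, although the states and the operator themselves have changed.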

5.4 Eigenvectors and eigenvalues

An eigenvector of a linear operator $A$ on a vector space is a non-zero vector $|x\rangle$
such that

$$A|x\rangle = \lambda|x\rangle \tag{5.53}$$

where the eigenvalue $\lambda$ is a complex number. Introducing the identity operator
on the right hand side of the equation, it can be rewritten as a proper matrix
equation

$$(A - \lambda I)|x\rangle = 0. \tag{5.54}$$

From the theory of linear equations it follows that this equation has no
non-zero solutions $|x\rangle$ unless the determinant of the matrix $A - \lambda I$ is zero. If the
determinant is non-zero, then the vector $|x\rangle$ is identically zero since the equation
is homogeneous (the right hand side being zero). So, in order to get non-trivial
eigenvectors, we require

$$\det(A - \lambda I) = 0. \tag{5.55}$$

This equation, called the secular equation, is an $n$-th degree equation for the
unknown $\lambda$ and thus it has $n$ complex, not necessarily distinct, roots. These are
the eigenvalues of the operator $A$. Once the eigenvalues are known, the
corresponding eigenvectors can be calculated. Several linearly independent
eigenvectors might correspond to one and the same eigenvalue. In that case
the eigenvalue is said to be degenerate. The degree of degeneracy is equal to
the number of linearly independent eigenvectors corresponding to the
degenerate eigenvalue.
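
As a small worked illustration of the secular equation (the matrix is chosen here for convenience and does not come from the text), consider a $2 \times 2$ matrix with two distinct eigenvalues:

```latex
A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\det(A - \lambda I) =
\begin{vmatrix} -\lambda & 1 \\ 1 & -\lambda \end{vmatrix}
= \lambda^2 - 1 = 0
\;\Longrightarrow\; \lambda = \pm 1,
\qquad
|x_\pm\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}.
```

By contrast, for the $2 \times 2$ identity matrix the secular equation reads $(1 - \lambda)^2 = 0$, so the single eigenvalue $\lambda = 1$ is doubly degenerate and every non-zero vector is an eigenvector.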

Diagonalization 

A matrix is said to be diagonal if it has non-zero elements only on the diagonal.
Using outer products of basis vectors $|i\rangle$ a diagonal operator can be written as

$$A = \sum_i \lambda_i|i\rangle\langle i|. \tag{5.56}$$

That this actually represents a diagonal matrix is clear from using equation
(5.34) to compute the matrix elements. We get

$$\langle k|A|j\rangle = \sum_i \lambda_i\langle k|i\rangle\langle i|j\rangle = \lambda_k\delta_{kj}.$$

That is, only the diagonal elements are non-zero, and equal to the numbers
$\lambda_i$.

It would be natural to identify the $\lambda_i$ with the eigenvalues of the operator.
Indeed, a diagonal operator trivially satisfies the eigenvalue equation (5.53) with
eigenvectors equal to the basis vectors $|i\rangle$. So the question arises, when is an
operator diagonalizable?

5.4.1 Spectral decomposition

Here we will just state an important theorem that allows us to use diagonal
representations for certain classes of operators.

An operator $A$ on a vector space is said to be normal if $A^\dagger A = AA^\dagger$. It
follows immediately that a hermitean operator is normal. Any unitary
operator $U$ is also normal. This follows from the simple calculation

$$U^\dagger U = U^{-1}U = I = UU^{-1} = UU^\dagger.$$

Spectral decomposition theorem 

Let $A$ be a normal operator on a vector space $V$. Then $A$ can be diagonalized
with respect to some orthonormal basis for $V$. Conversely, any operator that is
diagonalizable with respect to an orthonormal basis is normal.

This means that the eigenvalue equation (5.53) can be solved and the operator
can be represented explicitly as in equation (5.56). To be definite,

$$A = \sum_i \lambda_i|i\rangle\langle i|$$

where $\lambda_i$ are the eigenvalues of $A$, $\{|i\rangle\}$ is an orthonormal basis and each $|i\rangle$ is
an eigenvector of $A$. This equation can also be trivially rewritten in terms of
projectors $P_i$

$$A = \sum_i \lambda_i P_i. \tag{5.57}$$

In particular, hermitean and unitary operators can be diagonalized. Proofs 
of the spectral decomposition theorem can be found in |2H1 and • 

Diagonalization using unitary transformations 

We saw in section 5.3 that symmetry transformations are effected by unitary
operators. Choosing a suitable unitary operator, a normal operator can be
transformed into a diagonal form. Let $A$ be a normal operator. We want to
find a unitary operator $D$ that transforms $A$ into diagonal form, explicitly

$$A' = DAD^{-1}, \tag{5.58}$$

where we demand that $A'$ is diagonal. In order to be concrete, we represent the
operators by the corresponding matrices, so that $A'_{kl} = A'_k\delta_{kl}$. Then multiplying
the equation (5.58) by $D$ from the right gives $DA = A'D$. Writing this last
equation in terms of matrices yields a short calculation (note the subtle changes
of indices)

$$\sum_m D_{km}A_{ml} = \sum_m A'_{km}D_{ml} = \sum_m A'_k\delta_{km}D_{ml} = A'_kD_{kl} = \sum_m A'_k\delta_{ml}D_{km},$$

or

$$\sum_m D_{km}\big(A_{ml} - A'_k\delta_{ml}\big) = 0.$$

Now, fixing the index $k$, we have $n$ homogeneous equations for the
transformation matrix elements $D_{km}$. These equations have a non-trivial solution if and
only if the determinant of the coefficients vanishes, i.e. the determinant of
the matrix $A_{ml} - A'_k\delta_{ml}$ is zero,

$$\det\big(A_{ml} - A'_k\delta_{ml}\big) = 0.$$

Thus we get back the secular equation (5.55) explicitly. Solving this equation,
we get the $n$, not necessarily distinct, eigenvalues of the original matrix $A$.
Note that the eigenvalues do not depend on the diagonalizing matrix $D$.



Diagonalization of hermitean operators 

Suppose $A$ is a hermitean operator. Then it is diagonalizable and can be written
as in equation (5.56),

$$A = \sum_i \lambda_i|i\rangle\langle i|.$$

Taking the hermitean conjugate, we get

$$A^\dagger = \sum_i(\lambda_i|i\rangle\langle i|)^\dagger = \sum_i\lambda_i^*|i\rangle\langle i|,$$

since, obviously, $(|i\rangle\langle i|)^\dagger = |i\rangle\langle i|$ for each $i$. But $A = A^\dagger$ so that we must have

$$\sum_i\lambda_i^*|i\rangle\langle i| = \sum_i\lambda_i|i\rangle\langle i|.$$

This is only possible if all eigenvalues are real numbers, or $\lambda_i^* = \lambda_i$.

Thus, hermitean operators have real eigenvalues. Conversely, if a normal
operator has all eigenvalues real, then it is hermitean.



Simultaneous diagonalization theorem 

Suppose two operators $A$ and $B$ are diagonal in the same basis. Then it is easily
shown that they commute. This follows since the product of two diagonal
matrices is itself diagonal, and the elements on the diagonal are simply the products
of the corresponding diagonal elements of $A$ and $B$.

The converse is also true for hermitean operators: if two hermitean operators
commute, then they are simultaneously diagonalizable in the same basis. For a
proof, see [39].

5.5 Quantum dynamics 

The time evolution, or dynamics, of a closed quantum system can be described
in two related ways. A system is closed if there is no interaction with the rest of
the world. In practice, this might not be a realistic assumption. In principle it
is not possible to isolate one piece of the world from the rest; there are always
interactions between system and environment. The assumption is that this
interaction can either be made arbitrarily weak or be controlled. The usefulness of the
closedness assumption is that all of the system's dynamics is encoded in the
Hamiltonian.

Schrödinger equation

Traditionally, the dynamics is described by the Schrödinger equation. This is
a (first order) differential equation in the time variable $t$, equating the time
derivative of the state to the action of the Hamiltonian operator on the state

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle \tag{5.59}$$

where $\hbar$ is a fundamental physical constant setting the scale of quantum
phenomena.² In theoretical contexts its value is often taken to be 1. This amounts
to a choice of measuring units, where the 'natural' scale of phenomena is taken
to be submicroscopic.

The form of the Hamiltonian depends on the representation chosen for the
states. In the configuration space representation presented in chapter 4, the
states are wave functions, and the Hamiltonian is a partial differential operator.
In quantum computation contexts, $H$ will be a matrix acting on superpositions
of computational basis states.

Finding the proper Hamiltonian for a physical system is in general a difficult
problem. It is not generally considered as a question within quantum mechanics
itself, since as we have pointed out, quantum mechanics is just a framework for
formulating physical theories. However this last point might very well change
as our understanding of fundamental physics develops.³

² Its value is $\hbar = h/2\pi \approx 1.055 \cdot 10^{-34}$ Js, where $h \approx 6.626 \cdot 10^{-34}$ Js is Planck's constant.

³ It might be the case that a would-be "theory of everything" comes packaged with quantum
mechanics as an inseparable part.

Being a hermitean operator, $H$ can be diagonalized. Since the Hamiltonian is
physically related to the energy of the system, the corresponding eigenvalues and
eigenstates are referred to as energy eigenvalues and energy eigenstates. Naming
the eigenvalues $E_n$ and the eigenstates $|n\rangle$, we have the spectral decomposition

$$H = \sum_n E_n|n\rangle\langle n|. \tag{5.60}$$

The lowest energy eigenvalue is the ground state energy and the corresponding
eigenstate is simply the ground state. It is interesting to insert the expansion
(5.60) into the Schrödinger equation. A short calculation yields the following
equation holding for an arbitrary energy eigenstate

$$i\hbar\frac{d|n\rangle}{dt} = E_n|n\rangle,$$

the solution of which is

$$|n\rangle \to \exp(-iE_nt/\hbar)|n\rangle.$$

So, in a certain sense, the time dependence of energy eigenstates is trivial;
it is just an overall oscillating phase factor. It does play a role in superpositions
though, where different eigenstates oscillate with different frequencies. The
frequency of oscillation $\omega$ is defined by $\omega = E/\hbar$ so that the energy is often
written as $E_n = \hbar\omega_n$.



Unitary transformation 

The time development of a quantum system between two times $t_1$ and $t_2$ can
also be described by a unitary transformation. Let $|\psi(t_1)\rangle$ and $|\psi(t_2)\rangle$ be the
state at the two times respectively. Then

$$|\psi(t_2)\rangle = U(t_1,t_2)|\psi(t_1)\rangle \tag{5.61}$$

where $U(t_1,t_2)$ is a unitary operator that depends only on the two times $t_1$ and
$t_2$, i.e. there is no other time dependence in $U$.

The two ways of prescribing the dynamics of the quantum state can be
related by formally solving the Schrödinger equation. Provided we grant ourselves
the privilege to formally exponentiate the Hamiltonian operator, a solution to
equation (5.59) can be written

$$|\psi(t)\rangle = \exp(-iHt/\hbar)|\psi(0)\rangle. \tag{5.62}$$

The intuition here is that the system starts in the state $|\psi(0)\rangle$ at an initial
time $t = 0$ and develops into $|\psi(t)\rangle$ at time $t$. Let us check this by a short
calculation

$$i\hbar\frac{d}{dt}|\psi(t)\rangle = (i\hbar)(-iH/\hbar)\exp(-iHt/\hbar)|\psi(0)\rangle = H|\psi(t)\rangle.$$
Clearly, this calculation presupposes that the Hamiltonian has no explicit time
dependence.

Equation (5.62) can also be written somewhat more generally as

$$|\psi(t_2)\rangle = \exp(-iH(t_2 - t_1)/\hbar)|\psi(t_1)\rangle, \tag{5.63}$$

expressing time development from time $t_1$ to time $t_2$. Now comparing this
formal solution to the Schrödinger equation with (5.61), we see the connection between the
two ways of prescribing the dynamics of the system. The unitary operator $U$
should be equated to the exponential of the Hamiltonian, or

$$U(t_1,t_2) = \exp(-iH(t_2 - t_1)/\hbar).$$
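
The relation between the Hamiltonian and the time development operator can be made concrete for a two-level system. The sketch below assumes units with $\hbar = 1$ and a Hamiltonian $H = \omega X$, where $X$ is the Pauli X matrix; this choice is an illustrative assumption, made because $X^2 = I$ lets the exponential series be summed in closed form as $\cos\theta\, I - i\sin\theta\, X$ with $\theta = \omega(t_2 - t_1)$.

```python
import math

# Time development U(t1, t2) = exp(-i H (t2 - t1)) for H = omega * X, with hbar = 1.

def U(t1, t2, omega=1.0):
    """Closed form of exp(-i * theta * X), theta = omega * (t2 - t1), using X*X = I."""
    theta = omega * (t2 - t1)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -1j * s], [-1j * s, c]]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def norm_squared(ket):
    """<psi|psi> for a column vector."""
    return sum(abs(c) ** 2 for c in ket)

psi_1 = [1, 0]                       # state at time t1
t1, t2 = 0.0, 0.7
psi_2 = apply(U(t1, t2), psi_1)      # state at time t2

# Unitarity: the norm is preserved and the evolution can be undone with U^dagger.
recovered = apply(dagger(U(t1, t2)), psi_2)
```

The norm of `psi_2` equals that of `psi_1`, and `recovered` reproduces `psi_1` up to rounding.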

Unitarity and reversibility 

The unitarity of time evolution leads immediately to the reversibility of the
dynamics. Since $U(t_1,t_2)$ is invertible we can recover the 'initial' state $|\psi(t_1)\rangle$
from the 'final' state $|\psi(t_2)\rangle$ by multiplying the equation (5.61) by $U(t_1,t_2)^\dagger$

$$U(t_1,t_2)^\dagger|\psi(t_2)\rangle = U(t_1,t_2)^\dagger U(t_1,t_2)|\psi(t_1)\rangle = |\psi(t_1)\rangle.$$

There is one more issue that must be clarified in the context of quantum
dynamics, and that is the different so-called 'pictures'. In quantum dynamics,
there is a choice as to where the time dependence resides. One choice is to
let the states carry all the time dependence and let the operators be time
independent. This is the Schrödinger picture. Another choice is to have the
states themselves be time independent and let all time dependence be carried
by the operators. This is the Heisenberg picture. There are also intermediate
pictures, where the time dependence is split in a well defined way between states
and operators. One such picture is the interaction picture, which is useful in
calculations.

5.5.1 Schrödinger picture

Of the Schrödinger picture there is not much more to be said. In fact, it is
the Schrödinger picture that we have implicitly used in the preceding sections.
The Hamiltonian and all other operators are time independent and the time
dependence is carried by the states. Formally solving the Schrödinger equation
as in (5.62) makes this explicit.

5.5.2 Heisenberg picture 

The transition to the Heisenberg picture is interesting and we will carry it
through in some detail. First let $|\psi_S(t)\rangle$ and $|\phi_S(t)\rangle$ be two quantum states
where the index $S$ is used to indicate that these are taken in the Schrödinger
picture. Later in the discussion we will introduce the corresponding Heisenberg
picture states $|\psi_H\rangle = |\psi_S(0)\rangle$ and $|\phi_H\rangle = |\phi_S(0)\rangle$. Next consider an operator
$A_S$. In order to transfer all time dependence from the states to the operator,
we will consider the time derivative of the matrix element of $A_S$ with $|\psi_S(t)\rangle$ and
$|\phi_S(t)\rangle$. Thus we perform the calculation

$$\begin{aligned}
\frac{d}{dt}\langle\psi_S(t)|A_S|\phi_S(t)\rangle
&= \Big(\frac{d}{dt}\langle\psi_S(t)|\Big)A_S|\phi_S(t)\rangle + \langle\psi_S(t)|A_S\Big(\frac{d}{dt}|\phi_S(t)\rangle\Big) \\
&= \frac{1}{i\hbar}\langle\psi_S(t)|(A_SH - HA_S)|\phi_S(t)\rangle = \frac{1}{i\hbar}\langle\psi_S(t)|[A_S,H]|\phi_S(t)\rangle,
\end{aligned}$$

where the Schrödinger equation has been used.

Next we use the formal solution (5.62) to the Schrödinger equation

$$\langle\psi_S(t)| = \langle\psi_S(0)|\exp(iHt/\hbar), \tag{5.64}$$

$$|\phi_S(t)\rangle = \exp(-iHt/\hbar)|\phi_S(0)\rangle. \tag{5.65}$$

Substituting these expressions into the last step of the calculation gives

$$\frac{1}{i\hbar}\langle\psi_S(0)|[\exp(iHt/\hbar)A_S\exp(-iHt/\hbar), H]|\phi_S(0)\rangle,$$

where we have used the fact that $H$ commutes with $\exp(\pm iHt/\hbar)$.

This is the proper place to define the Heisenberg picture operator $A_H$

$$A_H = \exp(iHt/\hbar)A_S\exp(-iHt/\hbar). \tag{5.66}$$

Thus the result of this calculation is

$$\frac{d}{dt}\langle\psi_S(t)|A_S|\phi_S(t)\rangle = \frac{1}{i\hbar}\langle\psi_H|[A_H,H]|\phi_H\rangle.$$

Furthermore, making the substitutions (5.64), (5.65) and (5.66) in the left
hand side also, yields

$$\frac{d}{dt}\langle\psi_H|A_H|\phi_H\rangle = \frac{1}{i\hbar}\langle\psi_H|[A_H,H]|\phi_H\rangle.$$

Now, since the states $|\psi\rangle$ and $|\phi\rangle$ are arbitrary, this equation must be valid
for the operators

$$\frac{dA_H}{dt} = \frac{1}{i\hbar}[A_H,H]. \tag{5.67}$$

The discussion above shows that the time dependence can be transferred
from the states to the operators. The dynamical equation (5.67) can however
be more easily derived directly from equation (5.66) by direct differentiation.

As a last point, we can now make contact with the discussion in chapter 4
where we discussed quantization of classical systems. The dynamical equation
in the Heisenberg picture is actually identical to equation (4.42) of chapter 4.

5.6 Quantum measurement

Measurement is the process of getting numbers out of quantum systems. It is
perhaps the most non-intuitive aspect of quantum mechanics. It has also been
(and still is) an area of controversy and discussion related to the interpretation
of quantum mechanics. In classical physics it is in principle always possible to
find out all properties of a state to any desired degree of accuracy by making
appropriate measurements. Not so in quantum mechanics where the state
itself is not measurable. The only information we can get out of the system is
certain numbers (corresponding to the eigenvalues of operators) with certain
probabilities predicted by the theory. We will not review the vast literature
about quantum measurement here (which would be a Herculean task) but rather
present modern mainstream measurement theory.⁴

Some intuition can be obtained from thinking about how things are generally
done in quantum mechanics. States are transformed by applying (unitary)
operators to them. In particular, dynamics is expressed by applying the unitary
time development operator $U(t_1,t_2)$ to the state. Another clue comes from the
eigenvalue equation, where applying a hermitean operator to one of its eigenstates
extracts a real number. Therefore it makes sense to define quantum measurement in terms
of applying an operator to a state.



5.6.1 Projective measurement 

The context of projective measurement is the following. Suppose that the system
under consideration is in an (unknown) state $|\psi\rangle$. The object is to measure a
certain physical quantity, $O$ say. This quantity is represented by a hermitean
operator $O$. Then $O$ can be diagonalized and be written

$$O = \sum_{i=1}^{n}\lambda_iP_i \tag{5.68}$$

in terms of the projectors $P_i$. Now the only (real) numbers that are present in
this context are the eigenvalues $\lambda_i$ so we might suspect that the outcomes of the
measurement must be related to these numbers. Measurement is then defined
by the following postulate.

Measuring the operator $O$ in the state $|\psi\rangle$ gives the result $\lambda_i$ with probability

$$p(\lambda_i) = \langle\psi|P_i|\psi\rangle. \tag{5.69}$$

If the outcome of the measurement was $\lambda_i$, the state of the system immediately
after the measurement is

$$\frac{P_i|\psi\rangle}{\sqrt{p(\lambda_i)}}. \tag{5.70}$$

⁴ However it might be that the proper understanding of the interpretation and measurement
issues in quantum mechanics could have bearings on computation theory.
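Without loss of generality we can think of the state $|\psi\rangle$ in terms of an expansion
in the eigenstates of the operator $O$

$$|\psi\rangle = \sum_{i=1}^{n}\alpha_i|i\rangle.$$

Note, however, that the coefficients in this expansion are unknown, unless
we have deliberately prepared the system in a certain superposition. Now the
probability can be calculated explicitly

$$p(\lambda_i) = \Big(\sum_{k=1}^{n}\alpha_k^*\langle k|\Big)P_i\Big(\sum_{j=1}^{n}\alpha_j|j\rangle\Big) = \Big(\sum_{k=1}^{n}\alpha_k^*\langle k|\Big)\alpha_i|i\rangle = \alpha_i^*\alpha_i = |\alpha_i|^2.$$

The state of the system after the measurement becomes

$$\frac{P_i|\psi\rangle}{\sqrt{p(\lambda_i)}} = \frac{\alpha_i}{|\alpha_i|}|i\rangle,$$

that is, the eigenstate $|i\rangle$ up to a phase factor.

5.6.2 General measurement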

It is possible to define a slightly more general concept of quantum measurement.
Instead of having an observable with a spectral resolution as in equation (5.68),
it is sufficient to have a set of measurement operators $\{M_i\}$ acting on the state
space of the system. The index $i$ refers to the outcome of the measurement.
These operators are subject to a completeness requirement

$$\sum_i M_i^\dagger M_i = I \tag{5.71}$$

expressing the fact that the sum of probabilities must be 1. This follows
since the probability for outcome $k$ is defined as

$$p(k) = \langle\psi|M_k^\dagger M_k|\psi\rangle. \tag{5.72}$$

In this case, the state of the system after the measurement is

$$\frac{M_k|\psi\rangle}{\sqrt{p(k)}}. \tag{5.73}$$

It is thus easy to see that projective measurements are a special case of general
measurements. To go from a general measurement to a projective measurement,
we demand that the measurement operators $M_i$ are orthogonal projectors,
that is, they are hermitean and satisfy $M_iM_j = \delta_{ij}M_i$.
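
The measurement rules can be tried out on a single two-level system. The sketch below takes the measurement operators to be the two basis projectors, so it is at the same time an instance of projective and of general measurement; the state and all function names are illustrative assumptions.

```python
import math

# Outcome probabilities p(k) = <psi|M_k^dagger M_k|psi> and post-measurement states.

def inner(bra, ket):
    """Inner product <bra|ket>."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def apply(A, ket):
    """Apply the matrix A to a column vector."""
    return [sum(a * x for a, x in zip(row, ket)) for row in A]

def measure(operators, psi):
    """Return a list of (probability, post-measurement state) pairs, one per outcome."""
    results = []
    for M in operators:
        M_psi = apply(M, psi)
        p = inner(M_psi, M_psi).real            # equals <psi|M^dagger M|psi>
        post = [c / math.sqrt(p) for c in M_psi] if p > 0 else None
        results.append((p, post))
    return results

M0 = [[1, 0], [0, 0]]       # projector onto the first basis state
M1 = [[0, 0], [0, 1]]       # projector onto the second basis state
psi = [0.6, 0.8j]           # normalized example state

(p0, post0), (p1, post1) = measure([M0, M1], psi)
```

The probabilities are the squared moduli of the expansion coefficients and sum to 1, and each post-measurement state is the corresponding basis state up to a phase factor.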

5.6.3 POVM measurement

Yet another special case of general measurements is the so-called POVM⁵
measurement. The idea is the following. If one is only interested in the
probabilities, and not the resulting states, it is enough to know the combination
$M_k^\dagger M_k$; the operators $M_k$ themselves are not needed. If we define new
operators $E_k = M_k^\dagger M_k$, then $E_k$ is a positive operator such that $\sum_i E_i = I$ and the
probabilities are given by $p(k) = \langle\psi|E_k|\psi\rangle$.

Turning this argument around, a POVM measurement is defined by any set
of operators $\{E_i\}$ such that each operator $E_i$ is positive and the completeness
relation $\sum_i E_i = I$ holds. The probability of outcome $k$ becomes $p(k) = \langle\psi|E_k|\psi\rangle$.
Of course, not having the "square roots" of the $E_i$, the new states cannot be
computed.

⁵ Positive Operator-Valued Measure

Chapter 6

Abstract quantum
computation: The circuit
model

In this chapter, the theory of quantum computation will be outlined in an
abstract way, without recourse to physical considerations. This is in analogy with
how classical computation theory is generally developed, relying on models of
computation which abstract away from the details of actual physical computing
machinery. We will first look briefly at the quantum Turing machine model, but
will then turn to the quantum circuit model.

The quantum Turing machine (QTM) is a quantum generalization of the
classical Turing machine, and as such, can be considered to be a model of
a programmable quantum computer. It is not a practical model for either
algorithm construction or actual physical implementation. The QTM model is,
however, useful in discussing complexity theory, and has been used extensively
in that context [40].

The quantum circuit model (QCM) is easier to work with as regards
algorithm construction, and is closer to a physical realization.

The models are claimed to be equivalent, but there seem to be a few unclear
points, mainly having to do with subtleties as regards the QTMs. This is an
area of active research. For that reason, QTMs will not be treated here.

The original references are and 

6.1 Quantum alphabets, strings and languages 

As a starting point, let us see how far a quantum generalization of the concepts
of alphabets and languages will carry us. An alphabet is a finite, non-empty set
of symbols. Here we will use such an alphabet to label the quantum states of a
system. This is an abstract labeling and the realization in terms of a concrete
physical system will, as said, not concern us here. Suppose we have an alphabet
$\Sigma = \{S_1, S_2, \ldots, S_n\}$, then the corresponding set of quantum states is

$$\Sigma_Q = \{|S_1\rangle, |S_2\rangle, \ldots, |S_n\rangle\}. \tag{6.1}$$

These states are taken to form an orthonormal basis, i.e.

$$\langle S_i|S_j\rangle = \delta_{ij}. \tag{6.2}$$

It seems reasonable to introduce the term quantum alphabet for such sets of
states, though the term is not in common use.

Next, the classical concept of a string of length $L$, "$S_{i_1}S_{i_2}\cdots S_{i_L}$", is
generalized to the corresponding composite quantum state

$$|S_{i_1}S_{i_2}\cdots S_{i_L}\rangle. \tag{6.3}$$

This notation is a shorthand for the direct product notation

$$|S_{i_1}\rangle\otimes|S_{i_2}\rangle\otimes\cdots\otimes|S_{i_L}\rangle.$$

The orthonormality condition (6.2) generalizes to

$$\langle S|T\rangle = \delta_{ST}, \tag{6.4}$$

where the notation $S$ is used for the string "$S_{i_1}S_{i_2}\cdots S_{i_L}$". If the string length
needs to be recorded we could write $S_L$.

It is clear that the symbols $S_i$ only serve as labels, each of them ranging
over the set $\Sigma$. Just as in the classical case, the actual symbols used play no
role. The main difference as compared to the classical case is the possibility to
consider linear superpositions of the states $|S\rangle$. Thus the string states built from
$\Sigma_Q$ span an $n^L$-dimensional Hilbert space isomorphic to the complex vector
space $(\mathbb{C}^n)^{\otimes L}$.

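Imposing a lexicographic ordering $\mathrm{lex} : \{S_L\} \to \mathbb{N}$, we have the general state

$$|\psi\rangle_L = \sum_{\mathrm{lex}(S_L)} \alpha_{\mathrm{lex}(S_L)}|S_L\rangle, \tag{6.5}$$

which could be denoted a quantum string of length $L$. Here the $|S_L\rangle$ play the role
of basis states.

Normalization of the states, i.e. the demand that $\langle\psi|\psi\rangle = 1$, leads to
the usual restriction on the coefficients

$$\sum_{\mathrm{lex}(S_L)} |\alpha_{\mathrm{lex}(S_L)}|^2 = 1. \tag{6.6}$$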
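
The construction of quantum strings can be mirrored in a few lines of code. The sketch below assumes a hypothetical three-symbol alphabet and string length 2, enumerates the basis strings in lexicographic order (numbered from 0 here, purely as a convention of the sketch) and builds a uniform superposition satisfying the normalization condition.

```python
from itertools import product

# Basis strings of length L over an n-symbol alphabet, in lexicographic order.
alphabet = ["a", "b", "c"]      # hypothetical alphabet with n = 3 symbols
L = 2

basis_strings = ["".join(symbols) for symbols in product(alphabet, repeat=L)]
lex = {s: k for k, s in enumerate(basis_strings)}   # the ordering map, 0-based
dim = len(basis_strings)                            # n**L basis states

# A general state assigns one complex amplitude to each basis string.
# Here: the uniform superposition, which is normalized by construction.
amplitude = dim ** -0.5
state = {s: amplitude for s in basis_strings}
norm = sum(abs(a) ** 2 for a in state.values())
```

The number of amplitudes grows as $n^L$, which is the dimension of the Hilbert space of length-$L$ strings.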

The states considered so far have a fixed string length and are therefore finite 
in number. This is the normal situation when describing discrete quantum 
systems. In order to support Turing-like quantum computation, we have to 
consider arbitrary length strings. This is because fixed string length implies a 
finite size memory. Machines with a finite size memory are not really Turing
machines; they are finite state machines and as such do not support universal
computation. So, just as in the classical case, the tape must be potentially 
infinite. 

A potentially infinite tape in the quantum case implies a potentially
infinite-dimensional Hilbert space. When a new tape square is added (or activated)
during the computation, the tape Hilbert space goes from $n^L$-dimensional to
$n^{L+1}$-dimensional.

Classically this situation is described by considering the set of all strings $\Sigma^*$
over the alphabet $\Sigma$. The set $\Sigma^*$ thus contains all strings that could be written
on the potentially infinite tape. In the quantum case, the state in (6.5)
contains all length $L$ strings, provided all the coefficients are nonzero. Thus the
analogue of the classical set $\Sigma^*$ ought to be $\{|\psi\rangle_L\}_{L=0}^{\infty}$, which we will denote as
$\Sigma_Q^*$, the set of all quantum strings. It is the set of all superpositions of states
$|S_L\rangle$ for all string lengths $L$.

Classically, a language is a subset of $\Sigma^*$, or equivalently, an element of the
power set $\mathcal{P}(\Sigma^*)$. By analogy, we define a quantum language as a subset of
the set of all quantum states, i.e. as an element of the power set
$\mathcal{P}(\Sigma_Q^*) = \mathcal{P}(\{|\psi\rangle_L\}_{L=0}^{\infty})$.

A leaner notation 

The notation introduced so far, being a generalization of classical notions, is a
bit too heavy-handed. Utilizing the economy of the Dirac notation, we write $|i\rangle$
instead of $|S_i\rangle$, just keeping the indices. In the case of an alphabet with, say, $n$
symbols, $i$ can be thought of as ranging over the numbers $1, 2, \ldots, n$. Thus the
state of equation (6.3) is written $|i_1 i_2 \cdots i_L\rangle$. A general state is then

$$\sum_{i_1 i_2 \cdots i_L} a_{i_1 i_2 \cdots i_L} |i_1 i_2 \cdots i_L\rangle. \tag{6.7}$$

In conclusion then, the states $|i\rangle$ span an $n$-dimensional Hilbert space isomorphic to
$\mathbb{C}^n$. Likewise, the states $|i_1 i_2 \cdots i_L\rangle$ span an $n^L$-dimensional Hilbert space
isomorphic to $(\mathbb{C}^n)^{\otimes L}$. A general state in this space is given by (6.7).

Qubits 

A special case of this construction is the qubit and the qubit string corresponding
to the alphabet $\{0, 1\}$. Let $x$ denote a single bit. A classical $n$-bit string is then
"$x_1 x_2 \ldots x_n$". A single qubit is denoted by $|x\rangle$ and a multi-qubit state by
$|x_1 x_2 \ldots x_n\rangle$. These states are called computational basis states or the classical
basis [26]. Measurements on the quantum computer are often thought of as
being made in this basis and it can therefore be considered as the connection to
classical I/O-streams.

A convenient notation is often used for the computational basis states. Using
the binary number representation for natural numbers, we can denote the state

$$|x_1 x_2 \ldots x_n\rangle = |x_1 \cdot 2^{n-1} + x_2 \cdot 2^{n-2} + \ldots + x_{n-1} \cdot 2 + x_n\rangle_n. \tag{6.8}$$

The subscript $n$ is needed to indicate the number of qubits, but is often
suppressed. An example of this notation is $|1001\rangle = |9\rangle_4$.
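The labelling convention of equation (6.8) can be sketched in a few lines of Python. This is only an illustration; the helper names `basis_index` and `basis_label` are hypothetical and not part of the text.

```python
def basis_index(bits: str) -> int:
    """Integer label i of the computational basis state |x_1 x_2 ... x_n>."""
    n = len(bits)
    # x_1 carries the highest place value 2^(n-1), x_n the lowest (2^0).
    return sum(int(x) << (n - 1 - k) for k, x in enumerate(bits))


def basis_label(i: int, n: int) -> str:
    """Bit string x_1 ... x_n of the n-qubit basis state |i>_n."""
    return format(i, "0{}b".format(n))


print(basis_index("1001"))  # the example from the text: |1001> = |9>_4
print(basis_label(9, 4))
```

The explicit qubit count in `basis_label` plays the role of the subscript $n$: without it, leading zeros of the bit string could not be recovered from the integer label.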

A general state of the quantum computer is a complex linear superposition
of the basis states

$$|\psi\rangle = \sum_{i=0}^{2^n-1} a_i |i\rangle_n, \tag{6.9}$$

and the coefficients are again subject to the restriction

$$\sum_{i=0}^{2^n-1} |a_i|^2 = 1. \tag{6.10}$$

The Hilbert space of one qubit is isomorphic to the complex vector space
$\mathbb{C}^2$, and the $n$-qubit Hilbert space is isomorphic to $(\mathbb{C}^2)^{\otimes n}$. Thus, an $n$-qubit
quantum state is specified by $2^n$ complex coefficients, exponentially more
parameters than the $n$ bits of a classical string. Classically, it is not possible
to linearly combine different bit strings.^1

We also record the sometimes convenient explicit representation of the basis
states in terms of $2^n$-dimensional basis vectors, for example

$$|5\rangle_3 = (0, 0, 0, 0, 0, 1, 0, 0)^{T}.$$



6.2 The circuit model of quantum computation 

The quantum Turing machine model is not practical when it comes to actual 
algorithm construction, and just as in the classical case, it is a theoretical con- 
struct far from real computer design. The model is difficult to work with since 
the state of the computer is a superposition of not just the data on the tape, but 
also the head position and the internal configuration. This leads to tricky ques- 
tions about the halting of the machine, as different branches of the computation 
may take different numbers of steps to complete their respective computations.

The quantum circuit model is a generalization of the classical circuit model. 
Instead of bits, qubits are transmitted in the wires. The classical logic gates are 
replaced by quantum gates represented by unitary operators. In this model, the 
state of the computer is a superposition of the data only. The actual wiring of 
the circuit and the number of gates applied to the data are treated classically. 

^1 The classical case could be seen as corresponding to all the coefficients except one being
zero, the non-zero one being equal to 1.

The model is formulated in terms of unitary computation matrices that,
given arbitrary n-qubit input vectors, produce the desired n-qubit output
vectors. Algorithm construction amounts to composing such matrices out of
simpler, primitive matrices acting on just a few qubits at a time. There are two
important questions: finding a universal set of building blocks, i.e. programming
primitives, and finding methods for efficient algorithm construction.

The circuit model is, however, subject to certain limitations. Its classical
counterpart is the reversible logic circuit described in chapter 2. A logic circuit
computes a fixed function for a given range of input, say n bits. If the output
is required for data beyond this range, the circuit must be extended to m > n
input bits. In principle, an algorithm is needed for this, or put differently, an
algorithm is needed to generate the uniform circuit family $\{C_n\}_{n=1}^{\infty}$ computing
the required function for arbitrary length input. The task of assembling the
uniform circuit family cannot be performed by another circuit [25]. Therefore,
in itself, the circuit model is not a complete computational model, and the same
is true for quantum circuits.

We must therefore refine the notion of algorithm construction in the circuit
model to mean providing a uniform circuit family for the problem at hand.



Figure 6.1: A general circuit.

Since each wire carries a two-state quantum bit, an n-qubit circuit $C_n$
performs a unitary operation represented by a $2^n \times 2^n$ unitary matrix $U_{C_n}$.

If all the input wires are used for data, then a given circuit performs one 
and the same algorithm on the data. Thus, the program is hardwired into 
the circuit, and in this sense, the circuit is not a general purpose computer. 
But nothing prevents us from considering some of the inputs as supplying a 
program, or rather an instruction, to be carried out on the rest of the input, 
which is then the data proper.^2 Universality in the context of the circuit model
will be discussed below. 

^2 Of course, on a certain level of abstraction one need not make any distinction between
data and program.

6.2.1 Gates and wires



An abstract quantum circuit is built out of wires and gates.^3 The wires carry
the qubits between the gates, from outputs to inputs. The qubit processing
takes place in the gates. There are to be no feedback wires. The numbers of
output wires and input wires are equal for individual gates as well as for the
complete circuit.



Figure 6.2: A generic quantum gate. 

A generic quantum gate $G$ performs the unitary operation $|\psi_{\mathrm{out}}\rangle = U_G |\psi_{\mathrm{in}}\rangle$.
The $N \times N$ unitary matrix $U_{G_n}$ representing an n-qubit gate, with $N = 2^n$, belongs
to the Lie group $U(N)$.

6.2.2 General notation 

A general one-qubit unitary gate is represented by a 2 x 2 unitary matrix

$$U = \begin{pmatrix} U_{00} & U_{01} \\ U_{10} & U_{11} \end{pmatrix}, \tag{6.11}$$

and likewise, n-qubit gates are represented by $2^n \times 2^n$ matrices where the matrix
elements are denoted by $U_{ij}$ with indices ranging from 0 to $2^n - 1$.^4 For fixed
index $i$, the $U_{ij}$ form row vectors, and for fixed index $j$, the $U_{ij}$ form column
vectors. Unitarity of the matrix is equivalent to orthonormality of the row
vectors, and equally to orthonormality of the column vectors. This property is
often useful when deciding whether a given matrix is unitary.

These matrices are realizations of unitary operators $U$ in some orthonormal
basis, most often the computational basis. Graphically they are represented
by gates or circuit elements.^5

An important class of $2^{m+1} \times 2^{m+1}$ matrices are the controlled gates $\Lambda_m(U)$
defined by



^3 We leave open the physical implementation of the wires. They should not be thought of
as classical wires, but rather in terms of some unspecified interaction between gate outputs
and inputs, or by gates sharing qubits.

^4 Other index ranges are sometimes convenient.

^5 A few words on terminology: a gate (circuit element) is represented by a unitary operator
which is realized as a unitary matrix in some basis. The same holds for the complete circuit,
itself built out of gates.
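$$\Lambda_m(U)|x_1, \ldots, x_m, y\rangle =
\begin{cases}
U_{0y}|x_1, \ldots, x_m, 0\rangle + U_{1y}|x_1, \ldots, x_m, 1\rangle & \text{if } \bigwedge_{k=1}^{m} x_k = 1, \\
|x_1, \ldots, x_m, y\rangle & \text{if } \bigwedge_{k=1}^{m} x_k = 0,
\end{cases} \tag{6.12}$$

or in a different notation, where $x_1 x_2 \cdots x_m$ denotes the product of the bits,

$$\Lambda_m(U)|x_1, \ldots, x_m, y\rangle = |x_1, \ldots, x_m\rangle \, U^{x_1 x_2 \cdots x_m}|y\rangle,$$

or, explicitly, in block-matrix form

$$\Lambda_m(U) = \begin{pmatrix} I_{2^{m+1}-2} & 0 \\ 0 & \begin{matrix} U_{00} & U_{01} \\ U_{10} & U_{11} \end{matrix} \end{pmatrix},$$

where $I_{2^{m+1}-2}$ denotes the $(2^{m+1}-2) \times (2^{m+1}-2)$ identity matrix. The operator
$\Lambda_m(U)$ applies the operation $U$ to the $(m+1)$-th qubit conditioned on the first
$m$ qubits all being equal to 1; otherwise nothing is done. The controlled operations are the
quantum analogs of the selection primitive of classical computation.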
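As an illustration of the block-matrix form, the following minimal Python sketch builds $\Lambda_m(U)$ as a nested list, assuming the basis states are ordered lexicographically so that the two states with all control bits equal to 1 come last. The function name `controlled` is a hypothetical helper, not notation from the text.

```python
def controlled(u, m):
    """Matrix of Lambda_m(U): identity everywhere except the last 2x2 block."""
    dim = 2 ** (m + 1)
    mat = [[1 if r == c else 0 for c in range(dim)] for r in range(dim)]
    # The last two basis states |1...1,0> and |1...1,1> are the only ones
    # with every control bit equal to 1, so U occupies the lower-right block.
    for r in range(2):
        for c in range(2):
            mat[dim - 2 + r][dim - 2 + c] = u[r][c]
    return mat


X = [[0, 1], [1, 0]]
cnot = controlled(X, 1)     # Lambda_1(X)
toffoli = controlled(X, 2)  # Lambda_2(X)

for row in cnot:
    print(row)
```

Note that the identity block has dimension $2^{m+1} - 2$, so that the full matrix is $2^{m+1} \times 2^{m+1}$.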

Another notation in common use for controlled gates is $C^m(G)$. The gate
$G$ need not be a single qubit gate, though in most cases it is.

Figure 6.3 shows the diagrammatic representation of a $\Lambda_2(G)$ gate. Note
that conditioning on the control bit being 1 is denoted by a filled circle on the
corresponding wire.

Figure 6.3: A $\Lambda_2(G)$ gate.

There is nothing special about conditioning on 1. It is sometimes convenient
to condition on 0, or on combinations of 0 and 1 for different control qubits.
No special notation will be introduced for this case, but I will refer to it as
a generalized $\Lambda_n(G)$; an example is given in figure 6.4.

Figure 6.4: A generalized $\Lambda_2(G)$ gate with conditioning on 0 and 1.



6.2.3 Special discrete one-qubit gates 

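We first list a set of simple 1-qubit gates. In section 4.3.1, the spin-1/2 Pauli
matrices were introduced. With a change of notation they are

$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{6.13}$$

For the algebraic identities and commutation relations satisfied by these
matrices, refer back to section 4.3.1.

Apart from these gates, the Hadamard gate $H$, the phase gate $S$, and the
$\pi/8$-gate $T$ are given by the matrices

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad
S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}, \quad
T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}. \tag{6.14}$$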



These single qubit gates are important, as they can be used together with
the CNOT-gate to give universal sets of discrete quantum gates.
When simplifying circuits, the following identities are useful

$$HXH = Z, \quad HYH = -Y, \quad HZH = X. \tag{6.15}$$

The Hadamard gate can be used to produce equally weighted superpositions
as the following simple example shows

$$H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), \tag{6.16}$$
$$H|1\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle). \tag{6.17}$$
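The identities (6.15) are easy to check numerically. The sketch below uses plain nested lists for the 2 x 2 matrices, with the standard matrix forms of $X$, $Y$, $Z$ and $H$; the helper names `matmul` and `close` are illustrative only.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to floating point rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
minus_Y = [[-e for e in row] for row in Y]

hxh = matmul(matmul(H, X), H)
hyh = matmul(matmul(H, Y), H)
hzh = matmul(matmul(H, Z), H)

print(close(hxh, Z), close(hyh, minus_Y), close(hzh, X))
```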

6.2.4 One-qubit rotation operators 

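By formally exponentiating the Pauli matrices one obtains a set of continuous
rotation operators

$$R_x(\theta) = e^{-i\theta X/2} = \cos\frac{\theta}{2} I - i \sin\frac{\theta}{2} X =
\begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \tag{6.18}$$

$$R_y(\theta) = e^{-i\theta Y/2} = \cos\frac{\theta}{2} I - i \sin\frac{\theta}{2} Y =
\begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \tag{6.19}$$

$$R_z(\theta) = e^{-i\theta Z/2} = \cos\frac{\theta}{2} I - i \sin\frac{\theta}{2} Z =
\begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}. \tag{6.20}$$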

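The matrices in the right hand sides of these equations can be derived using
the following general formula for functions of Pauli matrices

$$f(\theta\, \hat{n} \cdot \vec{\sigma}) = \frac{f(\theta) + f(-\theta)}{2} I + \frac{f(\theta) - f(-\theta)}{2}\, \hat{n} \cdot \vec{\sigma}, \tag{6.21}$$

where $\hat{n} = (n_x, n_y, n_z)$ is a three-dimensional unit vector and
$\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z) = (X, Y, Z)$.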

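From the definition of the rotation operators, it is clear that successive
rotations about the same axis compose by adding their angles,

$$R_i(\alpha) R_i(\beta) = R_i(\alpha + \beta) \quad \text{where } i \in \{x, y, z\}. \tag{6.22}$$

Some more useful identities are

$$X R_x(\theta) X = R_x(\theta), \tag{6.23}$$
$$X R_y(\theta) X = R_y(-\theta), \tag{6.24}$$
$$X R_z(\theta) X = R_z(-\theta). \tag{6.25}$$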

Obviously, there are lots of such simple identities for single qubit gates. 
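A quick numerical check of the composition rule and of the identities (6.23) to (6.25) can be sketched as follows, using the matrix forms (6.18) to (6.20). The angles are arbitrary test values and the helper names are illustrative.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rz(theta):
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


X = [[0, 1], [1, 0]]
a, b = 0.7, 1.9  # arbitrary test angles


def conjugate_by_x(m):
    """Return X m X."""
    return matmul(matmul(X, m), X)


# Rotations about one axis compose by adding angles.
composition_ok = all(close(matmul(r(a), r(b)), r(a + b)) for r in (rx, ry, rz))
# Conjugation by X leaves R_x unchanged and reverses R_y and R_z.
flip_ok = (close(conjugate_by_x(rx(a)), rx(a))
           and close(conjugate_by_x(ry(a)), ry(-a))
           and close(conjugate_by_x(rz(a)), rz(-a)))

print(composition_ok, flip_ok)
```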

6.2.5 The rotation operators and the Bloch sphere 

The rotation operators can indeed be interpreted as rotations around the x, y
and z axes respectively, in a three-dimensional space. In such an interpretation,
a general qubit $|\psi\rangle$ can be represented as

$$|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle, \tag{6.26}$$

with angles $\theta$ and $\phi$ defined as in figure 6.5.

Figure 6.5: The Bloch sphere.

The basis vectors $|0\rangle$ and $|1\rangle$ are represented by $(0, 0, 1)$ and $(0, 0, -1)$
(corresponding to $\theta = 0$, $\phi = 0$ and $\theta = \pi$, $\phi = 0$), respectively.

Note that this representation (6.26) of the qubit follows from the general
form

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$

by writing $\alpha = e^{i\gamma}\cos\frac{\theta}{2}$ and $\beta = e^{i\gamma}e^{i\phi}\sin\frac{\theta}{2}$ so that

$$|\psi\rangle = e^{i\gamma}\left(\cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\right)$$

and then dropping the overall unphysical phase factor $e^{i\gamma}$. Thus the number of
real physical degrees of freedom of a qubit is 2.

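A general rotation by an angle $\alpha$ about an axis $\hat{n} = (n_x, n_y, n_z)$ is given by
the operator $R_{\hat{n}}(\alpha)$

$$R_{\hat{n}}(\alpha) = \exp(-i\alpha\, \hat{n} \cdot \vec{\sigma}/2) = \cos\frac{\alpha}{2} I - i \sin\frac{\alpha}{2}\,(n_x X + n_y Y + n_z Z). \tag{6.27}$$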

6.2.6 Single qubit phase-shift operators 

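The following phase-shift operators are sometimes useful

$$P(\alpha) = \begin{pmatrix} e^{i\alpha} & 0 \\ 0 & e^{i\alpha} \end{pmatrix} = e^{i\alpha} I, \tag{6.28}$$

$$E(\alpha) = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\alpha} \end{pmatrix} = P(\alpha/2) R_z(\alpha). \tag{6.29}$$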

6.2.7 Some special controlled operations 

Two of the most important and useful controlled operations are the two-qubit 
CNOT-gate and the three-qubit Toffoli-gate. 
Controlled NOT

The CNOT-gate is an example of a two-qubit controlled operation. It also goes
under the name (quantum) XOR. Its matrix representation in the computational
basis is

$$\mathrm{CNOT} = \Lambda_1(X) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{6.30}$$

Figure 6.6: Controlled NOT gate.

The CNOT-gate performs a NOT (i.e. an $X$) operation on the target bit $t$
conditioned on the control bit $c$ being 1.^6

Toffoli gate 

The Toffoli gate is $\Lambda_2(X)$. As noted in chapter 2, it is universal for classical
reversible computation.
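On computational basis states the Toffoli gate acts as a classical reversible map on three bits. The following sketch, with hypothetical function names, shows how fixing some inputs yields NAND and FANOUT, which is the usual way of illustrating classical universality.

```python
def toffoli(a, b, t):
    """Classical action of the Toffoli gate: flip t when a and b are both 1."""
    return a, b, t ^ (a & b)


def nand(a, b):
    # With the target prepared as 1, the target output is 1 XOR (a AND b).
    return toffoli(a, b, 1)[2]


def fanout(a):
    # With b = 1 and target 0, the target output is a copy of the bit a.
    out_a, _, copy = toffoli(a, 1, 0)
    return out_a, copy


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "NAND ->", nand(a, b))
```

Applying `toffoli` twice returns the original three bits, which reflects the reversibility of the gate.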



Figure 6.7: The Toffoli gate. (The control wires pass through unchanged and the
target $t$ is mapped to $t \oplus (ab)$.)



^6 Inevitably, language tends to get sloppy, and "bit" will be used when it is really the qubit
basis states that are referred to.

6.2.8 Some practical "machinery"

The relation between a quantum circuit and the corresponding unitary matrix is 
not entirely straightforward to figure out. For the benefit of the reader we here 
record some "practical tools of the trade" often left implicit in the literature. 

Gates acting independently 

Let $G$ denote a general single qubit gate. Then the action of $G$ on the $j$-th
qubit of a multi-qubit quantum state is denoted by $G(j)$ and defined by

$$G(j)|x_1 x_2 \cdots x_j \cdots x_n\rangle = |x_1 x_2 \cdots x_{j-1}\rangle \otimes (G(j)|x_j\rangle) \otimes |x_{j+1} \cdots x_n\rangle. \tag{6.31}$$

The generalization to several single qubit gates acting on different qubits
in the register is straightforward. In particular, specializing to the case of two
single qubit gates acting independently on a two-qubit state, we have

$$A(1) \otimes B(2)|x_1 x_2\rangle = A(1)B(2)|x_1 x_2\rangle = (A(1)|x_1\rangle) \otimes (B(2)|x_2\rangle). \tag{6.32}$$



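Figure 6.8: Two gates acting independently. ($A$ acts on $x_1$ and $B$ acts on $x_2$.)

In order to work out this equation explicitly, let the states be given by

$$|x_1\rangle = \alpha_1|0\rangle + \beta_1|1\rangle = \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix}, \qquad
|x_2\rangle = \alpha_2|0\rangle + \beta_2|1\rangle = \begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix}, \tag{6.33}$$

so that the composite state is

$$|x_1 x_2\rangle = |x_1\rangle \otimes |x_2\rangle = \begin{pmatrix} \alpha_1\alpha_2 \\ \alpha_1\beta_2 \\ \beta_1\alpha_2 \\ \beta_1\beta_2 \end{pmatrix}. \tag{6.34}$$

Next, let the operators be

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad
B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}. \tag{6.35}$$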
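The right hand side of equation (6.32) becomes

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} \otimes
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
\begin{pmatrix} \alpha_2 \\ \beta_2 \end{pmatrix} =
\begin{pmatrix}
(a_{11}\alpha_1 + a_{12}\beta_1)(b_{11}\alpha_2 + b_{12}\beta_2) \\
(a_{11}\alpha_1 + a_{12}\beta_1)(b_{21}\alpha_2 + b_{22}\beta_2) \\
(a_{21}\alpha_1 + a_{22}\beta_1)(b_{11}\alpha_2 + b_{12}\beta_2) \\
(a_{21}\alpha_1 + a_{22}\beta_1)(b_{21}\alpha_2 + b_{22}\beta_2)
\end{pmatrix}. \tag{6.36}$$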

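The operator product in the left hand side of (6.32) must be defined so that
it reproduces this expression. In analogy with the definition of the $\otimes$ product of
vectors in (6.34), it is natural to define

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{pmatrix} =
\begin{pmatrix}
a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\
a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\
a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\
a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22}
\end{pmatrix}. \tag{6.37}$$

Then the left hand side of (6.32) becomes

$$(A \otimes B)|x_1 x_2\rangle =
\begin{pmatrix}
a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\
a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\
a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\
a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22}
\end{pmatrix}
\begin{pmatrix} \alpha_1\alpha_2 \\ \alpha_1\beta_2 \\ \beta_1\alpha_2 \\ \beta_1\beta_2 \end{pmatrix}. \tag{6.38}$$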



Multiplying out (6.36) and (6.38) and doing some careful bookkeeping shows
the equality of these two expressions.

Two special cases are useful to note in order to gain some intuition. When
$A = I$, the identity matrix, we get in block diagonal form

$$I \otimes B = \begin{pmatrix} B & 0 \\ 0 & B \end{pmatrix}. \tag{6.39}$$

On the other hand, when $B = I$

$$A \otimes I = \begin{pmatrix} a_{11}I & a_{12}I \\ a_{21}I & a_{22}I \end{pmatrix} =
\begin{pmatrix}
a_{11} & 0 & a_{12} & 0 \\
0 & a_{11} & 0 & a_{12} \\
a_{21} & 0 & a_{22} & 0 \\
0 & a_{21} & 0 & a_{22}
\end{pmatrix}. \tag{6.40}$$

Note that general controlled operations cannot be achieved by direct
products of operators, i.e. by operators acting independently. This can be seen by
inspecting the explicit product $A \otimes B$ in (6.37).
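The bookkeeping can also be delegated to a few lines of Python. The sketch below implements the product of (6.37) for 2 x 2 matrices and the vector product of (6.34), and checks the two sides of (6.32) for arbitrarily chosen numbers. The function names and test values are illustrative assumptions; the matrices need not be unitary for the identity to hold.

```python
def kron(a, b):
    """A (x) B for matrices as nested lists: blocks a[i][j] * B."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_vec(u, v):
    """|u> (x) |v> for vectors given as flat lists."""
    return [x * y for x in u for y in v]


def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


A = [[1, 2], [3, 4]]
B = [[0, 1j], [1, 0]]
x1 = [0.6, 0.8j]
x2 = [0.5, -0.25]

lhs = apply(kron(A, B), kron_vec(x1, x2))            # (A (x) B)|x1 x2>
rhs = kron_vec(apply(A, x1), apply(B, x2))           # (A|x1>) (x) (B|x2>)
agree = all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))
print(agree)
```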
Explicit controlled operations

In the notation $\Lambda_m(U)$, the target qubit is supposed to be the qubit with lowest
place value in the binary number $x_1 \ldots x_m y$, that is $y$. However, the target can
be any of the qubits. In that case some care must be exercised when writing
out an explicit matrix representation. A simple example will illustrate this. Let
$U$ be the 2 x 2 unitary matrix

$$U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}.$$

Then the controlled operation $\Lambda_1(U)$ of figure 6.9

Figure 6.9: Conditioning on the first qubit.

has the explicit form

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & u_{00} & u_{01} \\ 0 & 0 & u_{10} & u_{11} \end{pmatrix}.$$

On the other hand, conditioning on the second qubit as in figure 6.10

Figure 6.10: Conditioning on the second qubit.

we get the explicit matrix representation

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & u_{00} & 0 & u_{01} \\ 0 & 0 & 1 & 0 \\ 0 & u_{10} & 0 & u_{11} \end{pmatrix}.$$

From the form of the matrix it is clear that it acts non-trivially on the
computational basis states $|01\rangle$ and $|11\rangle$, i.e. conditioned on the second qubit
being 1.
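One way to avoid index mistakes is to build the matrix column by column from the action on basis states. The sketch below does this for two qubits, with the matrix entries kept as symbolic strings so that the resulting pattern can be read off directly. The function name `controlled_u` and the string labels are illustrative assumptions.

```python
def controlled_u(u, control):
    """4x4 matrix of a controlled-U on two qubits.

    `control` is 1 or 2 and names the conditioning qubit; the other qubit
    is the target. Column `col` holds the image of the basis state |x1 x2>.
    """
    mat = [[0] * 4 for _ in range(4)]
    for col in range(4):
        x1, x2 = col >> 1, col & 1
        c, t = (x1, x2) if control == 1 else (x2, x1)
        if c == 0:
            mat[col][col] = 1  # control is 0: the basis state is unchanged
            continue
        for t_out in range(2):
            # U|t> = sum over t_out of u[t_out][t] |t_out>
            row = (c << 1 | t_out) if control == 1 else (t_out << 1 | c)
            mat[row][col] = u[t_out][t]
    return mat


U = [["u00", "u01"], ["u10", "u11"]]
for row in controlled_u(U, control=2):
    print(row)
```

With `control=1` the same function reproduces the familiar block form with $U$ in the lower-right corner.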



A note on directions in n-qubit space

A qubit is two-dimensional, therefore an n-qubit state is $2^n$-dimensional. Often,
one is interested in lower-dimensional subspaces of the full space. It is convenient
to use the term level to point out a particular direction in this space. For
instance, consider a 3-qubit computational basis state $|s\rangle$ with binary expansion
$|s_1 s_2 s_3\rangle$ and explicit vector representation

$$\begin{pmatrix} e_0 \\ e_1 \\ \vdots \\ e_7 \end{pmatrix},$$

where all $e_i = 0$ except $e_s = 1$. Thus, a certain computational basis state $|s\rangle$
points out the direction $e_s$ in the full Hilbert space. Such a direction is sometimes
called a level. Lower-dimensional unitary matrices can act on subspaces spanned
by levels of the full space.

Order of multiplication 

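When consecutive gates are applied to a qubit as in figure 6.11, the gates are
applied from left to right, so the corresponding operators are multiplied from
right to left. This is clear, since following the circuit from left to
right, the gate $A$ is first applied to the qubit $|x\rangle$, then $B$ and finally $C$, resulting
in the state $CBA|x\rangle$.

Figure 6.11: Order of multiplication for consecutive gates. (The wire carries
$|x\rangle$ through $A$, $B$ and $C$ in turn, giving $A|x\rangle$, $BA|x\rangle$ and $CBA|x\rangle$.)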



6.2.9 Entanglement 

It is important to realize that the quantum states entering and leaving a gate
might be entangled. Entanglement is a property of composite quantum systems.
A state of a composite quantum system is entangled when the state cannot be
decomposed as a product of states of the component systems. In fact, a general
n-qubit state as in equation (6.7) is said to be decomposable when it can be
expressed in terms of a tensor product

$$|x_1 x_2 \cdots x_n\rangle = |x_1\rangle \otimes |x_2\rangle \otimes \cdots \otimes |x_n\rangle$$

of component qubit states $|x_i\rangle = \alpha_i|0\rangle + \beta_i|1\rangle$. A state is entangled if and only
if it is non-decomposable.

The concept is easy to illustrate in the case of a two-qubit system. The
computational basis of such a system is

$$|00\rangle, \; |01\rangle, \; |10\rangle, \; |11\rangle.$$

Consider then the state, with $\alpha$ and $\beta$ both nonzero,

$$\alpha|00\rangle + \beta|11\rangle. \tag{6.41}$$

By studying the coefficients in the expansion of the product of two qubits

$$(\alpha_1|0\rangle + \beta_1|1\rangle)(\alpha_2|0\rangle + \beta_2|1\rangle) =
\alpha_1\alpha_2|00\rangle + \alpha_1\beta_2|01\rangle + \beta_1\alpha_2|10\rangle + \beta_1\beta_2|11\rangle \tag{6.42}$$

it is clear that there is no way to choose the coefficients $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ so
that $\alpha_1\alpha_2 \neq 0$, $\alpha_1\beta_2 = 0$, $\beta_1\alpha_2 = 0$ and $\beta_1\beta_2 \neq 0$ simultaneously in order to
reproduce the entangled state (6.41) as a product of single qubit states.

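An entangled state of the same kind as (6.41) can be produced by the application
of the CNOT-gate to a product state. The following figure illustrates an example.

Figure 6.12: Entangling a two-qubit product state using CNOT. (Input:
$(|0\rangle + |1\rangle)|1\rangle$; output: $|01\rangle + |10\rangle$.)

Working with un-normalized states for simplicity, we have the incoming
product state

$$(|0\rangle + |1\rangle)|1\rangle = |01\rangle + |11\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}.$$

Applying the CNOT gate yields

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} =
\begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix} = |01\rangle + |10\rangle.$$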

It is interesting and important to note that the CNOT gate cannot be written
as a $\otimes$-product of two single qubit gates. This is clear from studying the matrix
elements of the product $A \otimes B$ in equation (6.37). In fact, there is no way to
produce entanglement using only single qubit gates. This observation will be
put in context in section X.X.X on universal quantum gates.
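For two qubits, the coefficient argument can be condensed into a single test: comparing with the expansion of a product of two qubits, a state $a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle$ is a product state exactly when $a_{00}a_{11} - a_{01}a_{10} = 0$. This criterion is a standard fact that is not stated in the text itself. The sketch below applies it to the CNOT example; the function names are illustrative.

```python
def is_product(state, tol=1e-12):
    """Two-qubit product test: a00*a11 - a01*a10 vanishes for product states."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol


def cnot(state):
    """CNOT with the first qubit as control: swaps the |10> and |11> amplitudes."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


incoming = [0, 1, 0, 1]    # (|0> + |1>)|1> = |01> + |11>, un-normalized
outgoing = cnot(incoming)  # |01> + |10>

print(is_product(incoming), is_product(outgoing))
```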

Mathematically, entanglement is quite trivial, but the concept is far from 
trivial from a physical point of view, and has been a subject of discussion since 
the mid nineteen thirties. We will return briefly to this discussion in the last 
chapter. 



6.2.10 Some important gate constructions 

We need to be able to build complicated gates out of simpler ones. These simple
gates are, apart from the special discrete one-qubit gates, also general one-qubit
gates $U$ and CNOT gates. We will call these basic or elementary operations.
Here follow a few useful constructions.



Decomposition of a single qubit gate into Z and Y rotations

Every unitary 2 x 2 matrix $U$ can be expressed as

$$U = e^{i\alpha} R_z(\beta) R_y(\gamma) R_z(\delta), \tag{6.43}$$

in terms of the rotation operators (6.19) and (6.20), where the parameters
$\alpha$, $\beta$, $\gamma$ and $\delta$ are real.



Proof 

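First note that the unitarity constraint $UU^\dagger = I$ on a general 2 x 2 complex
matrix $U$ reduces the number of real parameters from 8 to 4. Furthermore, a
matrix is unitary if and only if its row vectors and column vectors are
orthonormal. Therefore, every unitary 2 x 2 matrix can be expressed in terms of four
real parameters $\alpha$, $\beta$, $\gamma$ and $\delta$ as

$$U = \begin{pmatrix}
e^{i(\alpha - \beta/2 - \delta/2)} \cos\frac{\gamma}{2} & -e^{i(\alpha - \beta/2 + \delta/2)} \sin\frac{\gamma}{2} \\
e^{i(\alpha + \beta/2 - \delta/2)} \sin\frac{\gamma}{2} & e^{i(\alpha + \beta/2 + \delta/2)} \cos\frac{\gamma}{2}
\end{pmatrix}. \tag{6.44}$$

For example, the column vectors are orthogonal, since

$$\left(e^{i(\alpha - \beta/2 - \delta/2)} \cos\tfrac{\gamma}{2}\right)^{*} \left(-e^{i(\alpha - \beta/2 + \delta/2)} \sin\tfrac{\gamma}{2}\right)
+ \left(e^{i(\alpha + \beta/2 - \delta/2)} \sin\tfrac{\gamma}{2}\right)^{*} \left(e^{i(\alpha + \beta/2 + \delta/2)} \cos\tfrac{\gamma}{2}\right)$$
$$= -e^{i\delta} \cos\tfrac{\gamma}{2} \sin\tfrac{\gamma}{2} + e^{i\delta} \sin\tfrac{\gamma}{2} \cos\tfrac{\gamma}{2} = 0.$$

The rest of the conditions on the matrix (6.44) can be checked similarly.
Multiplying out equation (6.43) yields exactly the matrix in (6.44).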


Decomposition of a general $\Lambda_1(U)$ gate

The above result allows a decomposition of a general $\Lambda_1(U)$ gate in terms of single
qubit gates and CNOT's.

Let $U$ be a single qubit unitary gate. Then there exist single qubit operators
$A$, $B$ and $C$ such that

$$ABC = I, \tag{6.45}$$

$$e^{i\alpha} AXBXC = U. \tag{6.46}$$

Proof

In terms of the rotation operators (6.18), (6.19) and (6.20), put

$$A = R_z(\beta) R_y(\gamma/2),$$
$$B = R_y(-\gamma/2) R_z(-(\delta + \beta)/2),$$
$$C = R_z((\delta - \beta)/2).$$

Then

$$ABC = R_z(\beta) R_y(\gamma/2) R_y(-\gamma/2) R_z(-(\delta + \beta)/2) R_z((\delta - \beta)/2) = I.$$

Next, using the identities (6.24) and (6.25) as well as $X^2 = I$,

$$XBX = X R_y(-\gamma/2) X X R_z(-(\delta + \beta)/2) X = R_y(\gamma/2) R_z((\delta + \beta)/2),$$

so that

$$AXBXC = R_z(\beta) R_y(\gamma/2) R_y(\gamma/2) R_z((\delta + \beta)/2) R_z((\delta - \beta)/2) =
R_z(\beta) R_y(\gamma) R_z(\delta).$$

Thus $U$ can be decomposed as $U = e^{i\alpha} AXBXC$ with $ABC = I$.
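The two defining properties of $A$, $B$ and $C$ can be confirmed numerically for sample parameter values. The following sketch assumes the rotation matrices in their standard form; the helper `chain` and the chosen angles are illustrative.

```python
import cmath
import math
from functools import reduce


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def chain(*ms):
    """Product of the given matrices in the written (left to right) order."""
    return reduce(matmul, ms)


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def rz(theta):
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
al, be, ga, de = 0.4, 1.3, 0.9, -2.1  # arbitrary real parameters

A = chain(rz(be), ry(ga / 2))
B = chain(ry(-ga / 2), rz(-(de + be) / 2))
C = rz((de - be) / 2)

phase = cmath.exp(1j * al)
U = [[phase * e for e in row] for row in chain(rz(be), ry(ga), rz(de))]
AXBXC = [[phase * e for e in row] for row in chain(A, X, B, X, C)]

abc_is_identity = close(chain(A, B, C), I2)
reproduces_u = close(AXBXC, U)
print(abc_is_identity, reproduces_u)
```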
These equations allow a decomposition of a general controlled-U operation
as in figure 6.13. When the control qubit is $|0\rangle$, the operation $ABC = I$ is applied to
the target qubit. On the other hand, when the control qubit is $|1\rangle$, the operation
$P(\alpha)AXBXC = U$ is applied to the target qubit.

Figure 6.13: Simulation of a controlled-U operation in terms of single qubit
operations and CNOT's.

In order to obtain a somewhat simpler diagrammatic representation,
the circuit identity of figure 6.14 is useful. This identity can be derived using
the methods of section 6.3.7. Using the identity, we get the circuit of figure 6.15.

Figure 6.14: Identity between $\Lambda_1(P(\alpha))$ and $E(\alpha) \otimes I$.

Figure 6.15: Simplified decomposition of a controlled-U operation in terms of
single qubit operations and CNOT's.

Thus, the operation $\Lambda_1(U)$ can be decomposed into six basic operations.



Decomposition of a general $\Lambda_2(U)$ gate

For any unitary 2 x 2 operation $U$, a $\Lambda_2(U)$ gate can be decomposed as in figure 6.16,
where $V$ is a matrix that satisfies $V^2 = U$.

Figure 6.16: Simulation of a $\Lambda_2(U)$ operation in terms of $\Lambda_1(V)$ operations and
CNOT's.

Proof

The gate construction is proved correct by an examination of the four cases of
combinations of basis states for the two control qubits $|c_1\rangle$ and $|c_2\rangle$.

• When $c_1 = 0$ and $c_2 = 0$, $I$ is applied to the target.

• When $c_1 = 0$ and $c_2 = 1$, $VV^\dagger = I$ is applied to the target.

• When $c_1 = 1$ and $c_2 = 0$, $V^\dagger V = I$ is applied to the target.

• Finally, when $c_1 = 1$ and $c_2 = 1$, $VV = U$ is applied to the target.

The circuit identities developed so far allow for the decomposition of a
general $\Lambda_2(U)$ gate in terms of 16 single qubit gates and CNOT gates. A naive
counting based on figures 6.15 and 6.16 would imply the need to use 20
elementary gates, but a closer examination shows that a few single qubit gates can be
merged and eliminated. The details are left to the reader.
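The case analysis can be cross-checked by multiplying out the circuit for a concrete choice of $U$. The sketch below takes $U = X$, so that the result should be the Toffoli gate, with $V = \frac{1}{2}\begin{pmatrix} 1+i & 1-i \\ 1-i & 1+i \end{pmatrix}$ as a square root of $X$. The gate ordering used here (controlled-$V$ from $c_2$, CNOT from $c_1$ to $c_2$, controlled-$V^\dagger$ from $c_2$, CNOT again, controlled-$V$ from $c_1$) is the commonly used form of this construction and is an assumption about the figure, as are all helper names.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(g):
    """Conjugate transpose of a 2x2 matrix."""
    return [[g[j][i].conjugate() for j in range(2)] for i in range(2)]


def controlled_on(g, control, target, n=3):
    """2^n x 2^n matrix applying the 2x2 gate g to `target` when `control` is 1.

    Qubits are numbered 0..n-1 from the left (most significant bit first).
    """
    dim = 2 ** n
    mat = [[0j] * dim for _ in range(dim)]
    for col in range(dim):
        bits = [(col >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control] == 0:
            mat[col][col] = 1 + 0j
            continue
        for out in range(2):
            new_bits = bits[:]
            new_bits[target] = out
            row = sum(b << (n - 1 - q) for q, b in enumerate(new_bits))
            mat[row][col] = g[out][bits[target]]
    return mat


X = [[0j, 1 + 0j], [1 + 0j, 0j]]
V = [[(1 + 1j) / 2, (1 - 1j) / 2], [(1 - 1j) / 2, (1 + 1j) / 2]]  # V*V = X

# Gates in the order they are applied; qubit 0 = c1, qubit 1 = c2, qubit 2 = target.
steps = [
    controlled_on(V, 1, 2),
    controlled_on(X, 0, 1),
    controlled_on(dagger(V), 1, 2),
    controlled_on(X, 0, 1),
    controlled_on(V, 0, 2),
]

circuit = [[1 + 0j if r == c else 0j for c in range(8)] for r in range(8)]
for step in steps:
    circuit = matmul(step, circuit)  # later gates multiply from the left

toffoli = [[1 + 0j if r == c else 0j for c in range(8)] for r in range(8)]
toffoli[6], toffoli[7] = toffoli[7], toffoli[6]  # swap |110> and |111>

matches = all(abs(circuit[r][c] - toffoli[r][c]) < 1e-12
              for r in range(8) for c in range(8))
print(matches)
```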



Decomposition of a general $\Lambda_{n-1}(U)$ gate

The decomposition of $\Lambda_2(U)$ can be generalized to more than 2 control qubits.
In fact, for any 2 x 2 unitary operator $U$, the controlled operation $\Lambda_{n-1}(U)$
can be implemented using only $\Theta(n^2)$ elementary operations. There are several
such constructions |^ . The one below is from [SJ].

Consider the gate construction in figure 6.17, which is an example with
$n = 5$.

Figure 6.17: Simulation of a $\Lambda_4(U)$ operation.

This is obviously a generalization of the decomposition of $\Lambda_2(U)$ and the
proof is similar, with $V$ an operator such that $V^2 = U$.

To estimate the complexity of this decomposition, we need a construction
of the generalized Toffoli gate $\Lambda_{n-2}(X)$. There are constructions of these gates
of order $\Theta(n)$ in the number of Toffoli gates and elementary gates used. The
reader is referred to [SJ] for details, a paper that contains a wealth of gate
constructions.

The overall complexity $C(n-1)$ of $\Lambda_{n-1}(U)$ can now be estimated. The
cost of simulating the gates $\Lambda_1(V)$ and $\Lambda_1(V^\dagger)$ is constant, and the complexity
of $\Lambda_{n-2}(X)$ is $\Theta(n)$. Then, the complexity of $\Lambda_{n-2}(V)$ is $C(n-2)$ by applying
the construction of figure 6.17 recursively. This yields a recurrence equation for
the cost

$$C(n-1) = C(n-2) + \Theta(n).$$

Since $(n-1)^2 - (n-2)^2$ is in $\Theta(n)$, it is clear that $C(n)$ is in $\Theta(n^2)$.^7
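The quadratic growth can be made concrete by iterating the recurrence with explicit constants. In the sketch below the $\Theta(n)$ term is modelled as exactly $c \cdot k$ at level $k$ and the base cost as $c$; both are simplifying assumptions made only for illustration.

```python
def cost(n, c=1):
    """Iterate C(k) = C(k-1) + c*k from the base case C(1) = c."""
    total = c
    for k in range(2, n + 1):
        total += c * k
    return total


for n in (10, 100, 1000):
    # The ratio cost(n) / n^2 settles near 1/2, i.e. quadratic growth.
    print(n, cost(n), cost(n) / n ** 2)
```

With these constants the recurrence sums to $c\,n(n+1)/2$, which makes the $\Theta(n^2)$ behaviour explicit.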



6.2.11 Decomposing a general two-level unitary operation
on n-qubit states

Let $U$ be a two-level unitary matrix acting on an n-qubit state. The two levels
involved can be any two directions in the full space of the state. Suppose the
two directions are given by the computational basis states $|s\rangle = |s_1 s_2 \cdots s_n\rangle$ and
$|t\rangle = |t_1 t_2 \cdots t_n\rangle$ respectively. The two bit strings can differ in between 1 and
$n$ places. In order to write $U$ in terms of a single qubit operation, swap
operations must be applied to yield two directions that differ in precisely one place
corresponding to one particular qubit. The swap operations can be performed
by generalized $\Lambda_{n-1}(X)$ operations. An example will clarify the situation.
Consider a 3-qubit computer and a particular two-level matrix $U$

$$U = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & a & 0 & 0 & 0 & 0 & b & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & c & 0 & 0 & 0 & 0 & d & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.$$

Denote by $U'$ the two-dimensional matrix

$$U' = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

The matrix $U$ acts non-trivially on the computational basis states $|001\rangle$ and
$|110\rangle$. Using so-called Gray coding, the binary number 001 can be transformed
by one-bit flips into 110 through the steps (they are not unique)

001 -> 000 -> 010 -> 110
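A sequence of one-bit flips connecting two bit strings is easy to generate mechanically. The sketch below flips the differing bits one at a time, starting from the rightmost position; this particular order is an arbitrary choice (the steps are not unique) that happens to reproduce the sequence given above. The function name `gray_path` is illustrative.

```python
def gray_path(s: str, t: str) -> list:
    """Bit strings from s to t, consecutive entries differing in one bit."""
    path = [s]
    current = list(s)
    # Visit positions from the rightmost bit to the leftmost one.
    for k in reversed(range(len(s))):
        if current[k] != t[k]:
            current[k] = t[k]
            path.append("".join(current))
    return path


print(" -> ".join(gray_path("001", "110")))
```

The number of flips equals the number of positions in which the two strings differ, so it lies between 1 and $n$.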



The first two steps can be performed by the circuit of figure 6.18.

^7 This solution is a particular solution. A more careful treatment of this equation shows
that the solution to the homogeneous equation, which in general yields exponential behavior,
in this case gives a solution $c \cdot 1^n = c$, a constant.

Figure 6.18: Implementation of example Gray code transformation.

The idea is to use generalized controlled NOT gates to transform $|001\rangle$ into
$|010\rangle$, which differs from $|110\rangle$ in the first qubit only, and then apply $U'$ to the
first qubit conditioned on the second and third qubit, i.e. a generalized $\Lambda_2(U')$. Finally, the
swap operations are applied in reverse. The full circuit thus becomes as pictured
in figure 6.19.



Figure 6.19: Example of decomposition of a two-level unitary operation on a 3-qubit
state.

The general case is a straightforward generalization of this procedure.

The number of gates needed to achieve this decomposition can now be
calculated. First, at most $2(n-1)$ generalized CNOT operations $\Lambda_{n-1}(X)$ are needed to swap
the input state as described above, and then back again. Each such swap
operation can be decomposed into $O(n)$ single qubit and CNOT gates according
to section ??. The same holds for the $\Lambda_{n-1}(U)$ operation, yielding an overall
$O(n^2)$ complexity for this gate construction.

These gate constructions will be used in the following sections discussing
universality for the quantum circuit model.

6.2.12 Universal sets of quantum gates 

In any computational model,^8 there is a choice as to what constitutes the basic
programming primitives. In the non-reversible classical circuit model, NAND-gates
together with FANOUT form a universal set of gates. In the reversible
classical circuit model, the Toffoli gate is universal. Another choice is the Fredkin
gate. Using either of these gates, any other logic operation as well as FANOUT
can be performed.

^8 Except in the Gandy machine model [52].

In the quantum circuit model, the situation is more complicated. The
unitary quantum gates form a continuum and an $N \times N$ unitary matrix has
$N(N-1)/2$ complex parameters $U_{ij}$. The possible sets of universal operations
are therefore much richer than in the classical case where the set of gates is
discrete and finite. However, it can be shown that an arbitrary $N \times N$ unitary
matrix $U$, acting on an $N$-dimensional vector space, can be expressed as a product
of lower-dimensional unitary matrices acting on subspaces of the vector space. In
fact, $U$ can be decomposed entirely in terms of matrices acting non-trivially
on just two-dimensional subspaces. For a proof see 01] and [35]. Such a
construction is however in general not efficient in terms of the number of required
two-dimensional matrices, as the number of matrices needed is $O(N^2)$, or more
precisely $2^{n-1}(2^n - 1)$. Remember that in this context the dimension of the
vector space is $N = 2^n$.^9 Furthermore, the matrices appearing in such a
construction might not be possible to realize in a real physical system. What we
need is a discrete set of standard gates out of which any unitary operation can
be composed. In general we cannot expect such a construction to be exact, but
rather obtained to within a certain approximation.

6.2.13 Exact and approximate universality 

The universality results reported in the literature are somewhat bewildering. 
In order to navigate among them, two distinctions should be kept in mind. 
First, the distinction between exact universality and approximate universality. 
Secondly, the distinction between a finite set of standard gates and an infinite 
set of gates (parameterized by some variables). Note at once that a finite set of 
discrete standard gates can only be universal in the approximate sense, since
the unitary matrices $U(N)$ form a continuous set, while circuits built using a
discrete set of gates can only generate a countable set of matrices.

Thus exact universality requires circuits with an infinite number of discrete
gates, or circuits using a finite number of gates continuously parameterized by
a set of variables.

For practical purposes, approximate universality is the important concept,
since any real quantum computer must presumably be built from a set of discrete
standard gates.

The two universality concepts, exact universality and approximate univer- 
sality, are defined below. 

* To avoid confusion, keep in mind that the number of wires n is equal to the
number of qubits. But each qubit spans a 2-dimensional complex vector space,
thus the linear dimension of the vectors and matrices is really $N = 2^n$ and
$N \times N$ respectively. This is explicit when using the computational basis
states.

Definition

Let $G$ be a set of gates with $r$ elements,

    $G = \{G_{1,n_1}, \ldots, G_{r,n_r}\}$,    (6.47)

where $G_{j,n_j}$ is a gate that acts on $n_j$ qubits. The set is
(approximately) universal if, for any $\epsilon > 0$, any unitary operator $U$
acting on $n$ input qubits can be approximated by a finite product $\tilde{U}$
of successive actions of gates $G_{j,n_j}$ so that the error $E(U, \tilde{U})$
satisfies

    $E(U, \tilde{U}) < \epsilon$,

where the error is defined as

    $E(U, V) = \max_{|\psi\rangle} \| (U - V)|\psi\rangle \|$.

The maximum is taken over the set of all normalized states in the state space 
of the computer. 

In other words, there is a finite circuit (i.e. with a finite number of gates),
approximating $U$, built from the gates in $G$.

The set is exactly universal if a finite circuit exactly reproduces an
arbitrary operation $U$. In that case the set $G$ must contain an infinite
number of gates (in the sense referred to above).

Otherwise, with a finite set G, it would take (in general) an infinitely large 
circuit, which would contradict the requirement that a computer must operate 
by finite means. 

Linear addition of errors 

With this definition of error it can be shown that errors add at most linearly.
Suppose $V_1, V_2, \ldots, V_m$ is a sequence of gates approximating another
sequence of gates $U_1, U_2, \ldots, U_m$, then

    $E(U_m U_{m-1} \cdots U_1, V_m V_{m-1} \cdots V_1) \leq \sum_{i=1}^{m} E(U_i, V_i)$.
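
The error measure and the linear bound can be checked numerically. The sketch
below assumes NumPy and uses the fact that the maximum of
$\|(U - V)|\psi\rangle\|$ over normalized states equals the largest singular
value (spectral norm) of $U - V$. The helper names random_unitary, perturb and
error are illustrative only and do not come from the text.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)

def random_unitary(dim):
    """Unitary Q-factor of a random complex matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(a)
    return q

def perturb(u, size):
    """A unitary close to u: u times exp(i*size*H) for a random Hermitian H."""
    a = rng.normal(size=u.shape) + 1j * rng.normal(size=u.shape)
    h = (a + a.conj().T) / 2
    w, v = np.linalg.eigh(h)
    return u @ (v * np.exp(1j * size * w)) @ v.conj().T

def error(u, v):
    """E(U, V) = max over unit vectors of ||(U - V)|psi>|| (spectral norm)."""
    return np.linalg.norm(u - v, 2)

us = [random_unitary(4) for _ in range(5)]   # exact gates U_1 .. U_5
vs = [perturb(u, 1e-2) for u in us]          # approximating gates V_1 .. V_5

# Gate 1 acts first, so the full operator is U_m ... U_2 U_1.
u_total = reduce(lambda acc, g: g @ acc, us)
v_total = reduce(lambda acc, g: g @ acc, vs)

total_error = error(u_total, v_total)
sum_of_errors = sum(error(u, v) for u, v in zip(us, vs))
assert total_error <= sum_of_errors + 1e-12
```

The check is a numerical spot test on random two-qubit gates, not a proof of
the inequality.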

6.2.14 Exact universality of two-level unitaries 

Let us restate the general decomposition result mentioned above. 

Let $U$ be an $N$-dimensional unitary matrix. Then it can be decomposed into
a product of at most $N(N-1)/2$ 2-dimensional (two-level) unitary matrices
|H) acting on two-dimensional subspaces of the full $N$-dimensional space.

Outline of proof 

Let $U(N)$ be a general $N \times N$ unitary matrix. Let $T_{pq}$ be
$N \times N$ identity matrices with the elements $T_{pp}$, $T_{pq}$, $T_{qp}$
and $T_{qq}$ replaced with the corresponding elements from a certain
$2 \times 2$ unitary matrix $C$, i.e.

    $T_{pp} = C_{11}, \quad T_{pq} = C_{12}, \quad T_{qp} = C_{21}, \quad T_{qq} = C_{22}$.

This matrix performs a unitary transformation on the $N$-dimensional state
space, leaving an $(N-2)$-dimensional subspace unchanged. By a process
analogous to Gaussian elimination, these matrices can be used to make all
non-diagonal matrix elements of $U(N)$ zero by a judicious choice of parameters
in the $C$-matrices.

The matrix $U(N)$ is multiplied on the right by a succession of $T$-matrices,

    $U(N) T_{N,N-1} T_{N,N-2} \cdots T_{N,1} = \begin{pmatrix} U(N-1) & 0 \\ 0 & e^{i\alpha} \end{pmatrix}$,

where $R(N) = T_{N,N-1} T_{N,N-2} \cdots T_{N,1}$.

Now this can be applied recursively to the matrix $U(N-1)$ down to $N = 2$,
when we have obtained a diagonal matrix with phase factors on the diagonal.
Applying one final diagonal phase matrix $D$ then yields

    $U(N) R(N) R(N-1) \cdots R(2) D = I(N)$

and $U(N)$ is obtained as

    $U(N) = D^\dagger R(2)^\dagger \cdots R(N-1)^\dagger R(N)^\dagger$.

The number of $T$-matrices used in this construction is
$\binom{N}{2} = N(N-1)/2$.

Furthermore, there exist $N$-dimensional unitary matrices that cannot be
decomposed into fewer than $N - 1$ two-level unitaries. To see this, consider
the case of a general diagonal unitary matrix.
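
The elimination procedure outlined above can be sketched in a few lines of
Python. The sketch assumes NumPy; the function name two_level_decompose and the
particular choice of the $2 \times 2$ block are made here for illustration and
are not necessarily those used in the cited proofs. Rows are processed from the
last one upwards, and each $T$-matrix mixes two columns so that one
off-diagonal element becomes zero.

```python
import numpy as np

def two_level_decompose(U, tol=1e-12):
    """Return (Ts, D) such that U @ Ts[0] @ ... @ Ts[-1] @ D = I.

    Each T in Ts is an identity matrix with one 2x2 unitary block inserted,
    and D is a diagonal matrix of phase factors.
    """
    W = np.array(U, dtype=complex)
    N = W.shape[0]
    Ts = []
    for r in range(N - 1, 0, -1):          # row to clear, last row first
        for q in range(r - 1, -1, -1):     # column of the element to remove
            a, b = W[r, q], W[r, r]
            if abs(a) < tol:
                continue                   # element is already zero
            norm = np.hypot(abs(a), abs(b))
            T = np.eye(N, dtype=complex)
            # The row vector (a, b) times this 2x2 block equals (0, norm).
            T[q, q], T[q, r] = b / norm, np.conj(a) / norm
            T[r, q], T[r, r] = -a / norm, np.conj(b) / norm
            W = W @ T
            Ts.append(T)
    D = np.diag(np.conj(np.diag(W)))       # removes the remaining phases
    return Ts, D

# Usage on a random 4x4 unitary (Q-factor of a random complex matrix).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(A)
Ts, D = two_level_decompose(U)

product = U.copy()
for T in Ts:
    product = product @ T
assert np.allclose(product @ D, np.eye(4))
assert len(Ts) <= 4 * 3 // 2
```

Inverting the product gives $U$ back as the diagonal matrix $D^\dagger$
followed by the adjoints of the $T$-matrices in reverse order.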

6.2.15 Summary of universality results 

A nice review of universal quantum gates can be found in [45].

The Deutsch gate 

In the first paper defining the quantum circuit model, Deutsch [25] showed that
there is a 3-qubit universal quantum gate. The gate is of the form
$\Lambda_2(U)$ where $U = iR_x(\pi\alpha)$ and $\alpha$ is any irrational
number.

This gate is approximately universal. The proof will not be repeated here,
but the result can be intuitively understood by noting that applying arbitrary
integer powers $p$ of $R_x(\pi\alpha)$ to a qubit $|u\rangle$ yields

    $R_x(\pi\alpha)^p |u\rangle = e^{-i p \pi \alpha \sigma_x / 2} |u\rangle$,

where $\sigma_x$ is the Pauli X matrix. Since $\alpha$ is irrational, the
numbers $p\pi\alpha/2 \pmod{2\pi}$ approximate arbitrarily well any number
$\lambda$ in the interval $[0, 2\pi)$. Therefore any operation
$e^{-i\lambda\sigma_x}$ can be arbitrarily well approximated, and hence so can
all gates of the form $\Lambda_2(ie^{-i\lambda\sigma_x})$. In particular, since
the Toffoli gate corresponds to $\lambda = \pi/2$, it can be approximated.
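
The density argument can be illustrated numerically. The sketch below assumes
NumPy, the standard convention $R_x(\theta) = e^{-i\theta\sigma_x/2}$, and the
arbitrary choice $\alpha = \sqrt{2}$; the helper names are illustrative. It
searches for a power $p$ such that $p\pi\alpha/2$ is close to $\pi/2$ modulo
$2\pi$, so that $iR_x(\pi\alpha)^p$ is close to the NOT operation $\sigma_x$
that appears in the Toffoli gate.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X (sigma_x)
I2 = np.eye(2, dtype=complex)

def rx(theta):
    """R_x(theta) = exp(-i theta X / 2)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * X

def best_power(alpha, target, max_p):
    """Power p in 1..max_p whose angle p*pi*alpha/2 (mod 2 pi) is closest to target."""
    best_p, best_dist = 1, np.inf
    for p in range(1, max_p + 1):
        lam = (p * np.pi * alpha / 2) % (2 * np.pi)
        dist = abs(lam - target)
        dist = min(dist, 2 * np.pi - dist)      # distance measured on the circle
        if dist < best_dist:
            best_p, best_dist = p, dist
    return best_p, best_dist

alpha = np.sqrt(2)                              # an irrational number (assumed choice)
p, dist = best_power(alpha, np.pi / 2, 1000)
approx_not = 1j * np.linalg.matrix_power(rx(np.pi * alpha), p)
error = np.linalg.norm(approx_not - X, 2)       # spectral-norm distance to X
print(p, dist, error)
```

Allowing larger powers makes the distance, and with it the error, as small as
desired, which is the content of the approximate universality statement.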



Two-qubit universal gates



Then DiVincenzo |3H1 showed that the Deutsch gate could be realized by a set
of one- and two-bit gates. This result was improved by Barenco [47] who showed
that a single two-bit gate is universal. This gate depends on three angular
parameters $\phi$, $\alpha$ and $\theta$. They are irrational multiples of
$\pi$ and of each other.

    [matrix of the Barenco gate]    (6.48)

Universality of almost any two-qubit gate

Although the Deutsch gate and the Barenco gate are of a certain form, it turns
out that there is nothing special about these particular gates. In [49] and |^
it was shown that almost any $n$-bit gate with $n \geq 2$ is universal. The
question as to which gates are not universal was clarified in [ST], a result
which we will return to below.



Single qubit and CNOT universality 

Relying on the fact that the Deutsch gate $\Lambda_2(U)$ is universal, and the
gate constructions of section 6.2.10 where it was shown that any $\Lambda_2(U)$
gate can be decomposed in terms of single qubit and CNOT gates, it follows that
single qubit and CNOT gates are universal for quantum computation. This is
approximate universality, since the Deutsch gate is approximately universal.

However, it is possible to show that single qubit and CNOT gates are in fact
exactly universal. This stronger result follows from the exact decomposition
of a general $N$-dimensional unitary matrix in terms of 2-dimensional unitaries
acting on two-dimensional subspaces. Relying on the construction in section
6.2.11, this can be simulated with single-qubit gates and CNOT gates.



6.2.16 Discrete sets of gates 

Finally we arrive at a discussion on finite sets of practical one qubit gates to- 
gether with CNOT gates. Relying on the previous discussion, it now suffices to 
effectively approximate general one-qubit gates with a discrete set of one-qubit 
gates. 

The standard set of gates consists of the Hadamard gate $H$, the $\pi/8$ gate
$T$ and the phase gate $S$ [53]. The phase gate is not really needed for
universality,
but it is needed in order to perform the approximations fault tolerantly, an 
important topic that will be briefly mentioned below. 
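
For concreteness, the three gates can be written down and checked numerically.
The sketch below assumes NumPy and the usual matrix conventions for $H$, $T$
and $S$ in the computational basis (these matrices are not given in this
section). It also shows why the phase gate adds nothing to universality as
such: it is the square of the $\pi/8$ gate.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard gate
T = np.diag([1, np.exp(1j * np.pi / 4)])                      # pi/8 gate
S = np.diag([1, 1j])                                          # phase gate
Z = np.diag([1, -1]).astype(complex)                          # Pauli Z

def is_unitary(m):
    """True if m^dagger m is the identity up to rounding."""
    return np.allclose(m.conj().T @ m, np.eye(m.shape[0]))

assert all(is_unitary(g) for g in (H, T, S))
assert np.allclose(T @ T, S)          # the phase gate is the square of T
assert np.allclose(S @ S, Z)
assert np.allclose(H @ H, np.eye(2))  # the Hadamard gate is its own inverse
```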

The proof that the gates $H$ and $T$ (the $\pi/8$ gate) are sufficient to
approximate any single qubit gate is quite complicated. It will not be repeated
here; the reader is instead referred to the literature.

Efficiency of the gate constructions

In order to estimate the efficiency of this construction, consider an n-qubit
circuit approximating a $U(2^n)$ operation. Suppose the circuit requires $f(n)$
gates, most of which need not be in the universal set of gates so that most of
them must be approximated. Suppose furthermore that the total tolerance to
errors is required to be $\epsilon$. Then each gate must be approximated to
within an error $\delta = \epsilon/f(n)$ since the errors add at most linearly.
The overall efficiency is thus determined by the efficiency with which a single
qubit gate can be approximated. Naively, one would suspect that roughly
$\Theta(1/\delta)$ gates from the discrete set would be needed, resulting in an
overall efficiency of
$\Omega(f(n)/\delta) = \Omega(f(n)/(\epsilon/f(n))) = \Omega(f(n)^2/\epsilon)$.

However, according to the Solovay-Kitaev theorem (which unfortunately is
too complicated to prove here) a general single qubit gate can be approximated
within an error $\delta$ using $O(\log^c(1/\delta))$ gates from the discrete
set. The constant $c$ is a number approximately equal to 2. Therefore, since
$1/\delta = f(n)/\epsilon$, the overall efficiency becomes
$O(f(n) \log^c(f(n)/\epsilon))$, which is a polylogarithmic increase in the
number of gates as compared to the original circuit.
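
The difference between the two estimates is easy to tabulate. The following
Python sketch simply evaluates the two expressions with all hidden constants
set to 1 and with $c = 2$; both choices are assumptions made for illustration,
and the function names are not from the text.

```python
import math

def naive_cost(f, eps):
    """f gates, each replaced by about 1/delta basic gates, delta = eps / f."""
    delta = eps / f
    return f / delta

def solovay_kitaev_cost(f, eps, c=2.0):
    """f gates, each replaced by about log^c(1/delta) basic gates."""
    delta = eps / f
    return f * math.log(1 / delta) ** c

# Compare the two estimates for a fixed total tolerance.
for f in (10, 1000, 100000):
    print(f, naive_cost(f, 1e-3), round(solovay_kitaev_cost(f, 1e-3)))
```

The naive count grows quadratically in the number of gates f, whereas the
Solovay-Kitaev count grows only as f times a power of a logarithm.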

What about the function $f(n)$? An upper bound for this function can be
calculated based on the gate constructions in sections 6.2.11 and 6.2.14. From
the decomposition of an arbitrary $U(2^n)$ matrix in terms of two-level
unitaries (section 6.2.14), we know that it takes $2^n(2^n - 1)/2$ gates, i.e.
$O(4^n)$. Then, according to section 6.2.11, each such two-level unitary gate
(acting on the state space of n qubits) can be implemented using $O(n^2)$
single qubit and CNOT gates. Therefore, the function $f(n)$ is in
$O(n^2 4^n)$.

Putting everything together we see that the overall complexity is
$O(n^2 4^n \log^c(n^2 4^n/\epsilon))$, which is exponential in n. But the
exponential behavior arises not from the approximation of single qubit gates by
discrete single qubit gates, but from the complexity of breaking down general
$U(2^n)$ matrices in terms of two-level matrices. Thus, fast quantum algorithms
cannot rely on naive universality constructions.
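
To get a feeling for the numbers, the following Python sketch evaluates the
count of two-level unitaries and the resulting bound on single qubit and CNOT
gates for small n. The hidden constant in the $O(n^2)$ factor is set to 1,
which is an assumption made only for illustration, as are the function names.

```python
def two_level_count(n):
    """Number of two-level unitaries for n qubits: 2^n (2^n - 1) / 2."""
    dim = 2 ** n
    return dim * (dim - 1) // 2

def basic_gate_bound(n):
    """n^2 basic gates per two-level unitary (constant factor set to 1)."""
    return n ** 2 * two_level_count(n)

# The counts grow by roughly a factor of four for each added qubit.
for n in range(1, 11):
    print(n, two_level_count(n), basic_gate_bound(n))
```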

6.2.17 General results 

The particular universality results reviewed above are beautifully subsumed
under a general theorem [ST]. In order to state the theorem, the notion of an
imprimitive gate must be defined.

A two-qubit gate $V$ is said to be primitive if it maps decomposable states
into decomposable states, i.e. if $|x\rangle$ and $|y\rangle$ are qubits, then
there are qubits $|u\rangle$ and $|v\rangle$ such that
$V|x\rangle|y\rangle = |u\rangle|v\rangle$. $V$ is imprimitive if it is not
primitive.
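
The definition can be probed numerically. The sketch below assumes NumPy and
uses a sampling heuristic (not a proof): a two-qubit state is decomposable
exactly when the determinant of its $2 \times 2$ coefficient matrix vanishes,
so a gate is reported as primitive if it keeps every sampled product state
decomposable. The function names and the tolerance are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_qubit():
    """A random normalized single-qubit state."""
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)

def entanglement(state):
    """|det| of the 2x2 coefficient matrix; zero exactly for product states."""
    return abs(np.linalg.det(state.reshape(2, 2)))

def looks_primitive(V, trials=20, tol=1e-9):
    """Heuristic test: V maps all sampled product states to product states."""
    for _ in range(trials):
        psi = np.kron(random_qubit(), random_qubit())
        if entanglement(V @ psi) > tol:
            return False
    return True

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
SWAP = np.array([[1, 0, 0, 0], [0, 0, 1, 0],
                 [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)

assert looks_primitive(np.kron(H, T))          # a product of single qubit gates
assert looks_primitive(np.kron(H, T) @ SWAP)   # single qubit gates after a swap
assert not looks_primitive(CNOT)               # CNOT creates entanglement
```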

There is a simple condition to determine whether a gate is primitive or
not. Let $P$ be an operator that swaps the two qubits in a product state, i.e.
$P|x\rangle|y\rangle = |y\rangle|x\rangle$. Then it can be shown that $V$ is
primitive if and only if $V$ can be expressed as $S \otimes T$ or as
$(S \otimes T)P$ for some single qubit gates $S$ and $T$, so that $V$ acts as
$V|x\rangle|y\rangle = S|x\rangle \otimes T|y\rangle$ or as
$V|x\rangle|y\rangle = S|y\rangle \otimes T|x\rangle$.

Theorem

Given a two-qubit gate $V$, the following conditions are equivalent:

• the collection of all single qubit gates together with V is approximately 
universal 

• the collection of all single qubit gates together with V is exactly universal 

• V is imprimitive 

Of course, a discrete set of single qubit gates can never yield exact
universality.